THE ACCURACY OF COMPUTATION WITH
APPROXIMATE NUMBERS
By 
HELEN M. WALKER 
Teachers College, Columbia University 
and 


VERA SANFORD 
Oneonta State Normal School. 


1. General Considerations. 

The number of figures necessarily free from error in the result 
of a piece of computation may be determined by studying the rela- 
tion between the number of digits in the result and the number of 
digits in the maximum error of the computation. It is the purpose 
of this essay to derive rules for the determination of the number 
of digits which are certain to be correct in computations based on 
measurement, but it must be understood that these rules state the 
minimum number of correct digits so that the result of a specific 
piece of computation may be accurate for more places than the 
rules indicate. 

2. Notation. 

Since the location of the decimal point has no connection with
the number of significant figures in a given number, it is assumed
that the decimal point follows the last significant figure in each of
the original numbers, the argument being somewhat simplified by this
assumption. Accordingly the greatest error in the statement of the
original numbers is $\pm 0.5$. Let $A$ and $B$ be the true values of
two numbers such that $A = a \cdot 10^{m}$ and $B = b \cdot 10^{n}$,
where $m$ and $n$ are positive integers and where $0.1 \le a < 1.0$
and $0.1 \le b < 1.0$. Then by the convention adopted above, the
numbers of significant figures in $A$ and $B$ are $m$ and $n$
respectively, and the observed values are not less than $A - 0.5$ and
$B - 0.5$ and not more than $A + 0.5$ and $B + 0.5$. Let $\epsilon$
represent the maximum error in the computation and let $\epsilon'$ be
the value of the largest term in the expansion of $\epsilon$.

3. Products.

The greatest error in the product of $A$ and $B$ will occur when
each is in excess by 0.5, the value of this error being

$$\epsilon = (A + .5)(B + .5) - AB = \tfrac{1}{2} a \cdot 10^{m} + \tfrac{1}{2} b \cdot 10^{n} + .25$$

For $ab \ge 0.1$, $AB$ has $m + n$ digits to the left of the decimal
point. For $ab < 0.1$, $AB$ has $m + n - 1$ digits to the left of the
decimal point. The cases to be considered are

(I) $m = n = 1$
(II) $m = n > 1$
(III) $m - n = 1$
(IV) $m - n > 1$


(I) Let $m = n = 1$. In this extreme case each factor consists
of a single digit and the product consists of one or two digits.
In this case, the figure in the unit's place is always affected by the
maximum error and the figure in the ten's place, when present, is
generally so affected.

(II) Let $m = n > 1$. Here $AB = ab \cdot 10^{2n}$ and

$$\epsilon = \tfrac{1}{2}(a + b) \cdot 10^{n} + .25.$$

But $0.1 \le a < 1.0$ and $0.1 \le b < 1.0$,

and therefore $10^{n-1} + \tfrac{1}{4} \le \epsilon < 10^{n} + \tfrac{1}{4}$.

The following conditions are possible:
Either (1) $ab \ge 0.1$ and $AB$ has $2n$ digits to the left of the
decimal point,
or (2) $ab < 0.1$ and $AB$ has $2n - 1$ digits to the left of
the decimal point.
Either (3) $\epsilon$ has $n$ digits to the left of the point,
or (4) $\epsilon$ has $n + 1$ digits to the left of the point, the
first one being 1 and all the others to the left of
the decimal point being zero. This can occur only
when $\epsilon$ is very near its maximum value. For
example, when $n = 4$, the value of $\epsilon$ must be less
than 10000.25.

Then if conditions (1) and (3) are fulfilled, the result has $n$
more digits to the left of the decimal point than has the error. A
subsequent proof will show that this means that at least $n - 1$ places
in the result are free from error. Under conditions (1) and (4),
the difference in the number of digits is $n - 1$, and at least $n - 2$
are not affected by the error. Similarly under conditions (2) and
(3), $n - 2$ digits are not affected by the error. Conditions (2)
and (4) cannot occur simultaneously.





The proof that conditions (2) and (4) are incompatible with
the conditions that $0.1 \le a < 1.0$ and $0.1 \le b < 1.0$ may be
obtained from fig. 1. The area within which these limits hold for $a$
and for $b$ is the area of the square bounded by $a = 1.0$, $a = 0.1$,
$b = 1.0$, $b = 0.1$, the numerical value of this area being 0.81. The
region within which $ab < 0.1$ is the area below the hyperbola
$ab = 0.1$. The region within which $a + b$ is larger than a specified
value $k$ is the region above the line $a + b = k$. Therefore the
probability that all of these conditions shall be simultaneously
fulfilled is the ratio of the shaded region in fig. 1 to the total
area of the square, or to 0.81. When $a + b > .5$, this probability is
only 0.000,014,8 and when $a + b > 0.55$, the probability vanishes
altogether.


(III) Let $m - n = 1$. Then $\epsilon = \tfrac{1}{2}(10a + b) \cdot 10^{n} + .25$.

But $1.1 \le 10a + b < 11$.
Now either $n = 1$ or $n > 1$.
Let $n = 1$. Then $5.75 \le \epsilon < 55.25$.
Either (1) $ab \ge 0.1$ and $AB$ has 3 digits to the left of the
decimal point,
or (2) $ab < 0.1$ and $AB$ has 2 digits to the left of the
decimal point.
Either (3) $10a + b < 1.95$ and $5.75 \le \epsilon < 10$,
or (4) $10a + b \ge 1.95$ and $10 \le \epsilon < 55.25$.

If conditions (1) and (3) are met, the product has 2 more
digits to the left of the decimal point than has the error. Thus one
or two places will in general be free from error. Under conditions
(1) and (4) or conditions (2) and (3) the number of such places
free from error is 0 or 1. By an analysis similar to that given
under (II) it appears that there is about one chance in ten that
conditions (2) and (4) should be simultaneously met, in which
case no place would be free from error.

Let $n = 2$. Then $55.25 \le \epsilon < 550.25$.

Either (1) $ab \ge 0.1$ and $AB$ has 5 digits to the left of the
decimal point,
or (2) $ab < 0.1$ and $AB$ has 4 digits to the left of the
decimal point.
Either (3) $10a + b < 1.995$ and $55.25 \le \epsilon < 100$,
or (4) $10a + b \ge 1.995$ and $100 \le \epsilon < 550.25$.

If conditions (1) and (3) are met, the product has either 2
or 3 places free from error. Under conditions (2) and (3) or
conditions (1) and (4), the product has 1 or 2 places free from
error. Simultaneous fulfilment of conditions (2) and (4) will be
rare but not impossible. However in this case, the first digit of the
error cannot be larger than 5, hence, as shown later, the number
of digits free from error in the result will usually be the number
in the product minus the number in the error, rather than one less
than that.

For $n > 2$, the constant 0.25 forms a still smaller proportion
of the error. Hence for larger values of $n$, if $m - n = 1$, the
product may be expected to have $n$ or $n - 1$ places free from error.

(IV) Let $m - n > 1$. Here $m \ge 3$ and therefore the terms
$\tfrac{1}{2} b \cdot 10^{n}$ and 0.25 are negligible in comparison
with $\tfrac{1}{2} a \cdot 10^{m}$ and may be disregarded, since
neither of them can affect the first place in the error. Then
$\epsilon' = \tfrac{1}{2} a \cdot 10^{m}$ and therefore
$0.5\,(10^{m-1}) \le \epsilon' < 0.5\,(10^{m})$.
Either (1) $ab \ge 0.1$ and $AB$ has $m + n$ places to the left of the
decimal point,
or (2) $ab < 0.1$ and $AB$ has $m + n - 1$ places to the left of
the decimal point.
Either (3) $a < .2$ and $\epsilon' < 10^{m-1}$ so that $\epsilon$ has
$m - 1$ places to the left of the decimal point,
or (4) $a \ge .2$ and $10^{m-1} \le \epsilon' < 0.5\,(10^{m})$, so that
$\epsilon$ has $m$ places to the left of the decimal point.

Conditions (1) and (3) would leave either $n + 1$ or $n$ places
free from error in the product. Conditions (2) and (3) or
conditions (1) and (4) would leave either $n$ or $n - 1$ places free
from error. If conditions (2) and (4) are met, there would be $n - 1$
places in the product to the left of the first digit in the error, and
since this first digit is not more than 5, the error is not likely to
affect the preceding digit. See section 6.

In general, therefore, if there are $n$ significant figures in the
less accurate of two approximations, the product of the two
approximations will have $n$ or $n - 1$ digits free from error. The
product of two such numbers should be rounded off until it contains
only as many significant figures as the less accurate of the two
numbers. The last digit in the product may then contain some error.
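
The comparison of digit counts used throughout this section may be tried on particular numbers. The following is a minimal sketch in Python, not part of the original argument: the function names are our own, and exact rational arithmetic is merely a convenient assumption for avoiding rounding in the check itself.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def max_product_error(A, B):
    """Largest error in A*B when each factor is in excess by 0.5."""
    return (A + HALF) * (B + HALF) - A * B  # equals A/2 + B/2 + 1/4

def digits_left_of_point(x):
    """Count the digits to the left of the decimal point (0 when x < 1)."""
    return len(str(int(x))) if x >= 1 else 0

def digit_gap(A, B):
    """Digits in the product minus digits in the maximum error."""
    return digits_left_of_point(A * B) - digits_left_of_point(max_product_error(A, B))

# Two four-figure factors (m = n = 4).
print(max_product_error(4321, 8765), digit_gap(4321, 8765))
print(max_product_error(1234, 1122), digit_gap(1234, 1122))
```

For 4321 and 8765 the product has 8 digits and the maximum error 6543.25 has 4, a difference of n; for 1234 and 1122 the product has only 7 digits, and the difference falls to n - 1.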

4. Quotients. 

The greatest error which can occur in a quotient arises when
the dividend is in excess by 0.5 and the divisor in defect by the
same amount.

$$\epsilon = \frac{A + .5}{B - .5} - \frac{A}{B} = \frac{b + a \cdot 10^{m-n}}{b\,(2b \cdot 10^{n} - 1)}$$

We must consider separately the cases

(I) $m = n$
(II) $m > n$
(III) $m < n$

(I) Let $m = n$. Then

$$\epsilon = \frac{b + a}{b\,(2b \cdot 10^{n} - 1)}$$

(a) If $m = n = 1$ there are 81 possible quotients of one-place
numbers, and an examination of these shows that in only thirty-one
of these cases the first digit is free from error.

(b) The case for $m = n > 1$ should be studied for specific
values of $n$. In general, however, the error is large as
$a \to 1$ and $b \to 0.1$, and is small as $a \to 0.1$ and
$b \to 1.0$. Also as $n$ increases, the influence of the constant
term in the denominator becomes less.


Therefore in general

$$\frac{0.55}{10^{n}} < \epsilon < \frac{1.1}{0.1\,(0.2 \cdot 10^{n} - 1)} \approx \frac{55}{10^{n}}.$$

If $a \to 1.0$, $b \to 0.1$, and $n$ is large, then
$\epsilon \to 55 / 10^{n}$, and the error has at least $n - 2$ zeros
to the left of the first digit. In this case, $A/B$ has one digit to
the left of the decimal point, so that the quotient will have at
least $n - 1$ digits to the left of the first digit in the error.

If $a \to 0.1$, $b \to 1.0$, and $n$ is large, then
$\epsilon \to 0.55 / 10^{n}$, and the error has $n$ zeros to the left
of the first digit. In this case $A/B$ has no digits to the left of
the decimal, so that the quotient will again have $n - 1$ digits to
the left of the first digit in the error.
Furthermore, $\epsilon = \left(\frac{a}{b} + 1\right)(2b \cdot 10^{n} - 1)^{-1}$, or

$$\epsilon = \frac{a + b}{2b^{2} \cdot 10^{n}} + \frac{a + b}{4b^{3} \cdot 10^{2n}} + \text{higher powers of } 10^{-n}.$$

Then if $a > b$, $\dfrac{1}{10^{n}} < \epsilon' < \dfrac{55}{10^{n}}$.
We then have 1 digit to the left of the decimal point in the
quotient, and either $n - 1$ or $n - 2$ zeros preceding the first
digit in the error.

If $a < b$, $\dfrac{0.55}{10^{n}} < \epsilon' < \dfrac{10}{10^{n}}$,
since $0.1 \le b < 1.0$.
In this case we have no digits to the left of the decimal point in
the quotient, and either $n$ or $n - 1$ zeros preceding the first digit
in the error.

(II) Let $m > n$.

(a) Let $m - n = 1$.

Then $\epsilon = \left(1 + \frac{10a}{b}\right)(2b \cdot 10^{n} - 1)^{-1}$
and $\epsilon' = \left(1 + \frac{10a}{b}\right)(2b \cdot 10^{n})^{-1}$,
the higher powers of $(2b \cdot 10^{n})^{-1}$ having no effect upon
the first digit in the error.
If $a > b$, there are 2 digits to the left of the decimal point
in the quotient and either $n - 2$ or $n - 3$ zeros in the error.
If $a < b$, there is 1 digit to the left of the point in the quotient
and either $n - 1$ or $n - 2$ zeros in the error.
Only in rare cases will there be as few as $n - 3$ zeros in front
of the first digit in the error. To secure this $\epsilon$ must be
greater than $10^{2-n}$. This probability will differ for different
values of $n$. For example, if $n = 4$, we have as bounding conditions,

$a > 20b^{2} - .101\,b$, $0.1 < b < 1.0$ and $0.1 < a < 1.0$.
The ratio of the area bounded by a= 206-101, $=. ,and 
a=I0 to the area of the square bounded by @=./, @=/Q, 
b=.) , and b 210, is b- 101+ 80, 01020) 
- 40 


PF - * | ceoet mb )a6 = 6.00054 
t= 0.) 
which is the probability that there would be only $n - 3$ zeros
following the decimal point in the error.

(b) Let $m - n > 1$. Then

$$\epsilon = \frac{b + a \cdot 10^{m-n}}{b\,(2b \cdot 10^{n} - 1)}$$

This situation should be studied for specific values of $n$.
However an approximation may be obtained by letting

$$\epsilon' = \frac{b + a \cdot 10^{m-n}}{2b^{2} \cdot 10^{n}}$$

since subsequent terms in the expansion do not affect the first digit.

If $a > b$, then $\epsilon' < 50\,(10^{m-2n})$ plus other terms which do
not affect the first digit.

Also $\epsilon' > 0.5\,(10^{m-2n})$.

In this case the quotient has $m - n + 1$ digits to the left of the
decimal point, while the error has either $m - 2n$, $m - 2n + 1$ or
$m - 2n + 2$. Consequently there are either $n - 1$, $n$, or $n + 1$
digits free from error in the result.

If $a < b$, then $0.05\,(10^{m-2n}) < \epsilon' < 5\,(10^{m-2n})$.

In this case the quotient has $m - n$ digits to the left of the decimal
point, while the error has either $m - 2n - 1$, $m - 2n$, or
$m - 2n + 1$. Again there are either $n - 1$, $n$, or $n + 1$ digits
free from error in the result.
(III) Let $m < n$.
Suppose $m + 1 = n$.

If $a \ge b$, the first digit of the quotient is immediately
to the right of the decimal point, while there are from $m - 1$
to $m + 1$ zeros between the point and the first digit in the error.

If $a < b$, there is one zero between the decimal point and the
first digit of the quotient, while in the error there are either $m$ or
$m + 1$ zeros.

In general, therefore, if there are $n$ digits in the less reliable
of two approximations, there will be either $n$, $n - 1$, or $n + 1$
digits free from error in their quotient. In a few rare cases, a
fortuitous combination of digits, discussed later, may throw the error
back into the $n - 2$ place. In general the quotient should be rounded
off to contain only as many places as there are in the less accurate of
the two numbers.
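
The maximum error of a quotient, (A + .5)/(B - .5) - A/B, reduces algebraically to (A + B)/(B(2B - 1)), which in the notation of section 2 is (b + a 10^(m-n)) / (b (2b 10^n - 1)). The sketch below, in Python with exact fractions, checks this reduction for one pair of numbers; the function names are hypothetical and are introduced only for illustration.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def max_quotient_error(A, B):
    """Dividend in excess by 0.5, divisor in defect by 0.5."""
    return (A + HALF) / (B - HALF) - Fraction(A, B)

def closed_form(a, b, m, n):
    """(b + a*10^(m-n)) / (b*(2*b*10^n - 1)) for A = a*10^m, B = b*10^n."""
    return (b + a * Fraction(10) ** (m - n)) / (b * (2 * b * 10 ** n - 1))

# A = 4321 has m = 4 figures, B = 87 has n = 2 figures.
A, B, m, n = 4321, 87, 4, 2
a, b = Fraction(A, 10 ** m), Fraction(B, 10 ** n)
print(max_quotient_error(A, B), closed_form(a, b, m, n))
print(float(max_quotient_error(A, B)))
```

The two printed fractions agree exactly, and the decimal value shows where the first digit of the error falls relative to the quotient 4321/87.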


5. Square Root.

When a number $A$ is in excess by 0.5, the error in its square
root is

$$(A + \tfrac{1}{2})^{1/2} - A^{1/2} = \tfrac{1}{4} A^{-1/2} - \tfrac{1}{32} A^{-3/2} + \tfrac{1}{128} A^{-5/2} - \cdots$$

When a number $A$ is in defect by 0.5, the error in its square
root is

$$(A - \tfrac{1}{2})^{1/2} - A^{1/2} = -\tfrac{1}{4} A^{-1/2} - \tfrac{1}{32} A^{-3/2} - \tfrac{1}{128} A^{-5/2} - \cdots$$

Obviously the greater error occurs when the number is in defect
by 0.5, but in either case we may neglect all terms after the
first. Each term can readily be shown to be larger than the term
following it, and the ratio of the first term to the second is so large
that the second term cannot affect the first digit in the error.

We must consider in turn the case in which $m$ is even and
that in which $m$ is odd.

(I) Let $m = 2k$.
Then $A = a \cdot 10^{2k}$ and has $2k$ digits to the left of the
decimal point. $A^{1/2} = a^{1/2} \cdot 10^{k}$ and has $k$ digits to
the left of the decimal.

Now $|\epsilon'| = \dfrac{1}{4A^{1/2}} = \dfrac{1}{4a^{1/2} \cdot 10^{k}}$.
But $0.25 < \dfrac{1}{4a^{1/2}} \le 0.791$.

Since $0.25\,(10^{-k}) < |\epsilon'| \le 0.791\,(10^{-k})$, the error
has $k$ zeros between its first digit and the decimal point.

Therefore when $m$ is even, the root contains as many significant
figures as the number.
(II) Let $m = 2k - 1$.
Then $A = a \cdot 10^{2k-1} = 10a \cdot 10^{2k-2}$.
Then $A^{1/2} = (10a)^{1/2} \cdot 10^{k-1}$ and has $k$ digits to the
left of the decimal.

$$|\epsilon'| = \frac{1}{4A^{1/2}} = \frac{1}{4\,(10a)^{1/2} \cdot 10^{k-1}}$$

$$0.079 < \frac{1}{4\,(10a)^{1/2}} \le 0.25,$$

$$(0.079)\,10^{-(k-1)} < |\epsilon'| \le 0.25\,(10^{-(k-1)}).$$

The error then will affect the $k$-th place to the right of the decimal
point, and the number of digits free from error will be
$k + k - 1 = 2k - 1$, which was the number of places in the square.
(III) There is also the case where the decimal point is so placed
that the second digit in the last period is not known, as in
$\sqrt{32.4}$ or $\sqrt{0.46825}$.

Here $|\epsilon'| = \tfrac{1}{4} A^{-1/2}$. In this case also the
number of digits free from error in the root is the number of digits
in the original number.

In general, then, the number of digits free from error in the
square root of a number is the number of digits in the number.
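
The claim that the first term of the expansion governs the error, and that the error in defect is the greater, may be checked numerically. The following Python sketch is offered only as an illustration; the function name is our own and ordinary floating-point arithmetic is assumed to be accurate enough for the comparison.

```python
import math

def sqrt_errors(A):
    """Return (error in defect, error in excess, first term) for the root of A."""
    in_defect = math.sqrt(A) - math.sqrt(A - 0.5)   # magnitude when A is in defect by 0.5
    in_excess = math.sqrt(A + 0.5) - math.sqrt(A)   # magnitude when A is in excess by 0.5
    first_term = 0.25 / math.sqrt(A)                # leading term of both expansions
    return in_defect, in_excess, first_term

for A in (32.0, 1000.0, 468250.0):
    print(A, sqrt_errors(A))
```

In each case the first term lies between the two exact errors, and the relative difference shrinks roughly as 1/(8A).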

6. Effect of the Error.

The following table will illustrate how an error of $n$ places may
affect either $n$ or $n + 1$ places in the computation:

                                     ERROR IN EXCESS       ERROR IN DEFECT
Result obtained by computation ....  6247   5986   7253    6247   5986   7253
Error .............................    33     53     12      33     53     12
True value ........................  6214   5933   7241    6280   6039   7265
Computed value, rounded ...........  6200   6000   7300    6200   6000   7300
True value, rounded ...............  6200   5900   7200    6300   6000   7300


We will now show that the chances are approximately 3 out of 4
that an error of $n$ digits affects $n$ and not $n + 1$ places in the
result. For convenience we may place the decimal point to the left of
the first digit in the error, the position of the decimal point being
entirely independent of the number of significant figures in the
computation.

Let $\epsilon$ = error.
$d$ = portion of the number to the right of the decimal point.
$c$ = portion of the number to the left of the decimal point.
$A$ = the true value of the number.
Then $c + d$ = result of computation.
$A = c + d - \epsilon$ = true value.

We will consider $\epsilon$ to be positive when the observed value is
in excess and negative when it is in defect.

We will consider separately the case where the computed value is
in excess and the case where it is in defect.


Suppose the result of computation to be in excess.

1. (a) If $d > .5$ and $\epsilon > d - .5$,
   (b) or $d < .5$ and $\epsilon > d + .5$,
   the error will affect $n + 1$ places in the result.

2. (a) If $d > .5$ and $\epsilon < d - .5$,
   (b) or $d < .5$ and $\epsilon < d + .5$,
   the error will affect only $n$ places in the result.

3. (a) If $d > .5$ and $\epsilon = d - .5$,
   (b) or $d < .5$ and $\epsilon = d + .5$,
   (c) or $d = .5$ and $\epsilon \ne 0$,
   the error will affect either $n$ or $n + 1$ places depending on
   whether the last digit of $c$ is odd or even. This is on the
   assumption of the usual rule, that in rounding off the digit 5
   the previous digit is made even.


Since the number of digits in $\epsilon$ is finite, the values of $d$
and of $\epsilon$ form discrete series, so that we shall have to think
of $d = .5$ not as an infinitesimal but as a finite portion of the
scale, ranging from $d = .495$ to $d = .505$ when $n = 2$, from
$d = .4995$ to $d = .5005$ when $n = 3$, etc. If we map the region
bounded by $d = 0$, $d = 1$, $\epsilon = 0$, $\epsilon = 1$ the
proportions of area representing conditions (1), (2) and (3) represent
the respective probabilities of these three sets of conditions. As $n$
increases, the width of the strip $d = .5$ becomes smaller, the
probability of (3) becomes smaller, and the probability of (2)
approaches 3/4.

When $n = 2$ these areas are respectively

Area (1) ...................  .245025
Area (2) ...................  .735075
Area (3) ...................  .0199
                             1.0000000

If we assume that the last digit in $c$ is as likely to be even
as to be odd, we may say that the probability that the error will
affect $n + 1$ places in the result is slightly more than 1/4 when
there are two digits in $\epsilon$. This ratio will approach 1/4 if the
number of digits in $\epsilon$ increases.

A similar argument holds when the result of computation is 


in defect. 


7. Summary of Rules. 

On the assumption that an error of $n$ places affects only $n$
places in the result we have the following rules:

If the less accurate of two approximate numbers contains $n$
significant digits, their product and their quotient each contain $n$
or $n - 1$ significant digits.

The square root of a number contains as many significant figures
as the number.

About once in four times, the error will affect one more place
than these rules state, for the reasons given in section 6.





COMBINING TWO PROBABILITY FUNCTIONS 
By 


WILLIAM DOWELL BATEN,
University of Michigan.


The object of this paper is to show results which arise from
combining two probability functions in finding the probability
function for the sum of two independent variables. The first part 
presents the sum function when the probability law for each indi- 
vidual variable is “one-half” of the Pearson Type X law. From 
this law arise certain ideas concerning the Beta function which are 
not presented by texts treating this subject. 

The second part presents some peculiar probability functions 
when special laws for the individual variables are considered. Here 


certain laws with infinite discontinuities are combined. 


I. The probability function for the sum of $n$ variables when
each is subject to the function $e^{-x}$.

Let the probability that the chance variable $x_1$ lies in the
interval $(x_1, x_1 + dx_1)$ be to within infinitesimals of higher
order $f(x_1)\,dx_1$, and the probability that the chance variable
$x_2$ lies in the interval $(x_2, x_2 + dx_2)$ be to within
infinitesimals of higher order $g(x_2)\,dx_2$, where $x_1$ and $x_2$
may have respectively any real value.

By a well known theorem, the probability that the sum,
$x_1 + x_2 = z$, lies in the interval $(z, z + dz)$ is, to within
infinitesimals of higher order,

$$F(z)\,dz = \int_{-\infty}^{\infty} f(x_1)\, g(z - x_1)\, dx_1 \cdot dz.$$

Let

$f(x_1) = e^{-x_1}$ for $(0, \infty)$,
$\quad = 0$ elsewhere,

and

$g(x_2) = e^{-x_2}$ for $(0, \infty)$,
$\quad = 0$ elsewhere.
According to the above theorem, the probability function for the
sum, $x_1 + x_2 = z$ is

$F(z) = \int_{0}^{z} e^{-x_1}\, e^{-(z - x_1)}\, dx_1 = z\,e^{-z}$ for $(0, \infty)$
$\quad = 0$ elsewhere,

which is a Pearson Type III function. The probability functions
or laws for $x_1$ and $x_2$ are discontinuous at the origin, while the
law for the sum is continuous from minus infinity to plus infinity.
By using $F(z)$ and $g(x_3)$, the frequency function for the
sum, $x_1 + x_2 + x_3 = z$, is

$F_3(z) = \int_{0}^{z} x\,e^{-x}\, e^{-(z - x)}\, dx = z^{2} e^{-z} / 2!$ for $(0, \infty)$
$\quad = 0$ elsewhere; where $x_1 + x_2 = x$.

In general, if the probability function for the individual
variable $x_i$ is

$f(x_i) = e^{-x_i}$ for $(0, \infty)$
$\quad = 0$ elsewhere,

then the probability function for the sum, $\sum_{i=1}^{n} x_i = z$, is

$F_n(z) = z^{n-1} e^{-z} / (n-1)!$ for $(0, \infty)$
$\quad = 0$ elsewhere.

This is also a Type III law. Others have studied this law and have
obtained functions for the sum and the average.¹
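
The step from the law of n variables to the law of n + 1 may be checked numerically. The sketch below is an illustration only, under stated assumptions: the helper names `type_iii` and `convolve_at` are our own, and a simple midpoint rule stands in for the integral of the theorem.

```python
import math

def type_iii(n):
    """Density z^(n-1) e^(-z) / (n-1)! on the positive axis, zero elsewhere."""
    def density(z):
        return z ** (n - 1) * math.exp(-z) / math.factorial(n - 1) if z > 0 else 0.0
    return density

def convolve_at(f, g, z, steps=2000):
    """Midpoint-rule estimate of the integral of f(x) g(z - x) for 0 < x < z."""
    if z <= 0:
        return 0.0
    h = z / steps
    return h * sum(f((i + 0.5) * h) * g(z - (i + 0.5) * h) for i in range(steps))

e1 = type_iii(1)                       # the law e^(-x) of a single variable
print(convolve_at(e1, e1, 2.0), type_iii(2)(2.0))
print(convolve_at(type_iii(2), e1, 3.0), type_iii(3)(3.0))
```

Each printed pair compares the numerical convolution with the closed form for the next value of n.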


¹ Mayr, Wahrscheinlichkeitsfunktionen und ihre Anwendungen, Monatshefte
für Math. und Phys., Vol. 30, 1920. p. 20.

Church, On the mean and squared standard deviation of small samples
from any population, Biometrika, Vol. 18, 1926. pp. 321-394.

Irwin, On the frequency distribution of the means of samples from a
population having any law of frequency with finite moments, with special
reference to Pearson Type II, Biometrika, Vol. 19, 1927. pp. 225-239.

C. C. Craig, Sampling when the parent population is a Pearson Type
III, Biometrika, Vol. 21, 1929. pp. 287-293.

A. T. Craig, On the distribution of certain Statistics, Am. Jour. of
Math., Vol. 54, No. 2, 1932. pp. 353-366.

Baten, Frequency laws for the sum of n variables which are subject
each to given frequency laws, Metron, Vol. X, No. 3, 1932. pp. 75-91.

The purpose of developing this law for the sum of $n$ independent
variables is to show how certain finite summations are evaluated.
An interesting summation arises when $f$ and $g$ are interchanged in
certain cases. For example the law for the sum,

$z = \sum_{i=1}^{n} x_i$

is $F_n(z)$, and the law for the sum,

$z = \sum_{i=1}^{n+1} x_i$, is $\int_{0}^{z} F_n(x)\, g(z - x)\, dx = \int_{0}^{z} g(x)\, F_n(z - x)\, dx$.

(a) $F_{n+1}(z) = \int_{0}^{z} e^{-x}\, \dfrac{(z - x)^{n-1} e^{-z + x}}{(n-1)!}\, dx = \dfrac{z^{n} e^{-z}}{(n-1)!} \sum_{k=0}^{n-1} \dfrac{(-1)^{k} \binom{n-1}{k}}{k + 1}$ for $(0, \infty)$
$\quad = 0$, elsewhere.

Since the probability function for the sum of the first $n + 1$
variables, when each is subject to $f$, is

(b) $z^{n} e^{-z} / n!$, for the positive axis,

then (a) and (b) are equal and the summation in the above
expression for (a) is equal to $1/n$; hence

$$\sum_{k=0}^{n-1} \frac{(-1)^{k} \binom{n-1}{k}}{k + 1} = \frac{1}{n}.$$
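
The finite summation obtained by equating (a) and (b), namely that the alternating sum of the binomial coefficients of order n - 1 divided by k + 1 equals 1/n, can be confirmed exactly for small n. This is a sketch only; the function name is hypothetical, and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction
from math import comb

def alternating_sum(n):
    """Sum over k = 0..n-1 of (-1)^k C(n-1, k) / (k + 1), as an exact fraction."""
    return sum(Fraction((-1) ** k * comb(n - 1, k), k + 1) for k in range(n))

for n in range(1, 8):
    print(n, alternating_sum(n))
```

Each line prints n followed by the fraction 1/n.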


If the probability function for the sum of the first $2n$ variables
is obtained by "combining" the probability function for the
sum of the first $n$ variables with the probability function for the
sum of the following $n$ variables, another interesting summation
arises. This summation is a Beta function in disguise. For example
the probability function for the sum, $x_1 + x_2 + x_3 + x_4 = z$, is
$z^{3} e^{-z} / 3!$ for positive $z$ and zero elsewhere, and the
probability function for the sum, $x_5 + x_6 + x_7 + x_8 = v$, is
$v^{3} e^{-v} / 3!$ for positive $v$ and zero elsewhere. The
probability function for the sum, $z + v = \sum_{i=1}^{8} x_i = w$, is

$$F(w) = \frac{1}{3!\,3!} \int_{0}^{w} e^{-z}\, e^{-w + z}\, z^{3}\, (w^{3} - 3w^{2} z + 3w z^{2} - z^{3})\, dz$$

$$= \frac{w^{7} e^{-w}}{3!\,3!} \left( \frac{1}{4} - \frac{3}{5} + \frac{3}{6} - \frac{1}{7} \right) = \frac{w^{7} e^{-w}}{7!}, \text{ for the positive axis}$$

and zero elsewhere. The quantity in parentheses has for numerators
the coefficients of the binomial $(a - b)^{3}$ while the denominators
begin with a number greater by one than the exponent of the
binomial and increase by unity from term to term. The above form
suggests the following integral

$$\frac{1}{4} - \frac{3}{5} + \frac{3}{6} - \frac{1}{7} = \int_{0}^{1} x^{3} (1 - x)^{3}\, dx = B(4, 4).$$

In general the probability function for the sum of the first $2n$
variables, by using the probability function for the sum of the first
$n$ and the probability function for the sum of the following $n$, is

$$\frac{w^{2n-1} e^{-w}}{(n-1)!\,(n-1)!} \sum_{k=0}^{n-1} \frac{(-1)^{k} \binom{n-1}{k}}{n + k} \text{ for } (0, \infty) \text{ and zero elsewhere.}$$

The summation can be written as a definite integral

$$\sum_{k=0}^{n-1} \frac{(-1)^{k} \binom{n-1}{k}}{n + k} = \int_{0}^{1} x^{n-1} (1 - x)^{n-1}\, dx = B(n, n) = \frac{\Gamma(n)\,\Gamma(n)}{\Gamma(2n)}.$$

If the probability law for the sum of $n$ independent variables
is obtained by combining the probability law for the sum of the
first $s$ variables with the law for the sum of the following $n - s$
variables the following summation arises which is also equal to a
Beta function. This summation is

$$\sum_{k=0}^{n-s-1} \frac{(-1)^{k} \binom{n-s-1}{k}}{s + k} = \int_{0}^{1} x^{s-1} (1 - x)^{n-s-1}\, dx = B(s, n - s).$$

This idea concerning the Beta function appears to be new.
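
The identification of the alternating summation with B(s, n - s) may be verified exactly for integer arguments, where the Beta function equals (s-1)!(n-s-1)!/(n-1)!. The following sketch assumes only that factorial form; the function names are our own.

```python
from fractions import Fraction
from math import comb, factorial

def beta_sum(s, n):
    """Sum over k = 0..n-s-1 of (-1)^k C(n-s-1, k) / (s + k)."""
    return sum(Fraction((-1) ** k * comb(n - s - 1, k), s + k) for k in range(n - s))

def beta_integer(p, q):
    """B(p, q) for positive integers p and q, from the factorial form."""
    return Fraction(factorial(p - 1) * factorial(q - 1), factorial(p + q - 1))

print(beta_sum(4, 8), beta_integer(4, 4))   # the example 1/4 - 3/5 + 3/6 - 1/7
print(beta_sum(2, 7), beta_integer(2, 5))
```

The first line reproduces the worked example for eight variables, where both expressions equal 1/140.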

II. Combining two probability functions. 

Combining here shall mean finding the probability function 
for the sum of the variables from the probability functions of the 
individual variables. Many peculiar functions arise when various 
laws are used for the probability functions of the individual varia- 
bles. This section presents a few of them. 

1. Let

$f(x) = 1$, for (0,1) and zero elsewhere,
and
$g(y) = 3(1 - 2y)^{2}$, for (0,1) and zero elsewhere.

These laws are drawn below. Both have two points of discontinuity.
The probability function for the sum, $x + y = z$ is

$F(z) = 4z^{3} - 6z^{2} + 3z$, for the interval (0,1)
$\quad = -4z^{3} + 18z^{2} - 27z + 14$, for (1,2)
$\quad = 0$, elsewhere.

$F(z)$ is continuous, symmetrical about the line $z = 1$ with large
slope at the points (0,0), (1,1) and (2,0). There is a cusp at
(1,1). $F(z)$ is drawn below.





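[Figure: graph of $F(z)$, with the branch $4z^{3} - 6z^{2} + 3z$ on (0,1) and the branch $-4z^{3} + 18z^{2} - 27z + 14$ on (1,2).]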
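
The two cubic branches may be checked against a direct numerical convolution. The sketch below is illustrative and rests on an assumption: the first law is taken to be f(x) = 1 on (0,1), which is the reading under which the stated result is reproduced; the function names and the midpoint rule are our own choices.

```python
def g(y):
    """The law 3(1 - 2y)^2 on (0,1), zero elsewhere."""
    return 3.0 * (1.0 - 2.0 * y) ** 2 if 0.0 < y < 1.0 else 0.0

def F_closed(z):
    """The two cubic branches stated for the sum x + y = z."""
    if 0.0 < z <= 1.0:
        return 4 * z**3 - 6 * z**2 + 3 * z
    if 1.0 < z < 2.0:
        return -4 * z**3 + 18 * z**2 - 27 * z + 14
    return 0.0

def F_numeric(z, steps=4000):
    """Midpoint-rule convolution of f = 1 on (0,1) with g."""
    lo, hi = max(0.0, z - 1.0), min(1.0, z)
    if hi <= lo:
        return 0.0
    h = (hi - lo) / steps
    return h * sum(g(z - (lo + (i + 0.5) * h)) for i in range(steps))

for z in (0.25, 0.5, 1.0, 1.5, 1.75):
    print(z, F_closed(z), F_numeric(z))
```

The value at z = 1 is unity, which is the cusp at (1,1) mentioned in the text.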





2. Let $f(x) = \dfrac{1}{2\sqrt{x}}$, for (0,1) and zero elsewhere, and
$g(y) = \dfrac{1}{2\sqrt{1 - y}}$, for (0,1) and zero elsewhere. The function
$f(x)$ is the probability function for the square of the variable
if the probability law for the variable is unity in (0,1) and zero
elsewhere. The function $f(x)$ approaches infinity at the origin,
while $g(y)$ is a similar curve turned in the opposite direction and
has an infinite slope at (1,0).










The law for the sum, $x + y = z$ is

$F(z) = \tfrac{1}{4} \log \left[ \dfrac{1 + \sqrt{z}}{1 - \sqrt{z}} \right]$, for (0,1)
$\quad = \tfrac{1}{4} \log \left[ \dfrac{3 - z + 2\sqrt{2 - z}}{z - 1} \right]$, for (1,2)
$\quad = 0$, elsewhere.

$F(z)$ is somewhat of a surprise for it is equal to zero at the origin
and the point (2,0) and approaches infinity from the right and from
the left at the point (1,0). The slope of the law for the sum is
infinite at the origin and at the points (2,0) and (1,0). $F(z)$
appears below.





3. Let $f(x) = 1$, for (0,1) and zero elsewhere and $g(y) = 1$
for the interval (0,1) and zero elsewhere; then the probability law
of $w = x^{2}$ is $h(w) = \dfrac{1}{2\sqrt{w}}$, for the interval (0,1)
and zero elsewhere. Let $u = y^{2}$, then the probability function for
$u$ is $k(u) = \dfrac{1}{2\sqrt{u}}$, for the interval (0,1) and zero
elsewhere. According to the theorem used in part I the probability law
for the sum, $x^{2} + y^{2} = z$ is

$F(z) = \pi / 4$, for the interval (0,1)
$\quad = \tfrac{1}{4} \arccos \left[ \dfrac{8z - z^{2} - 8}{z^{2}} \right]$, for (1,2)
$\quad = 0$, elsewhere. The plot of $F(z)$ is below.







The functions $h(w)$ and $k(u)$ are J-shaped functions with
infinite slope at the origin and are equal to $f(x)$ in example 2. The
law for the sum of the squares in this case has one point of
discontinuity which is at the origin. The function for the sum is
constant throughout the interval (0,1) and is equal to an inverse
cosine function throughout the interval (1,2).

4. If $f(x) = 3(1 - 2x)^{2}$ for the interval (0,1) and zero
elsewhere and $g(y) = 3(1 - 2y)^{2}$ for the interval (0,1) and zero
elsewhere, then the law for the sum, $x + y = z$, is

$F(z) = .6\,(8z^{5} - 40z^{4} + 80z^{3} - 60z^{2} + 15z)$, for the interval (0,1)
$\quad = .6\,(-8z^{5} + 40z^{4} - 80z^{3} + 100z^{2} - 95z + 46)$, for (1,2)
$\quad = 0$, elsewhere.

The function $F(z)$ has three modes and has its highest point
where one would least expect it, and has large slopes at the origin,
and at the points (1,0), (2,0). To appreciate the nature of $F(z)$
here the graphs of the functions for $x$ and $y$ should be examined.
They are U-shaped curves which are tangent to the horizontal axis
at the middle of the interval (0,1). See the second figure in 1.
$F(z)$ is shown in the following figure.


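[Figure: graph of $F(z)$, with the branch $.6\,(8z^{5} - 40z^{4} + 80z^{3} - 60z^{2} + 15z)$ on (0,1) and the branch $.6\,(-8z^{5} + 40z^{4} - 80z^{3} + 100z^{2} - 95z + 46)$ on (1,2); the points (1,0) and (2,0) are marked on the axis.]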
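
The quintic branches of example 4 may be checked in the same numerical way. The sketch assumes both laws are 3(1 - 2t)^2 on (0,1); the function names and the midpoint rule are our own and are not part of the paper.

```python
def u_law(t):
    """The U-shaped law 3(1 - 2t)^2 on (0,1), zero elsewhere."""
    return 3.0 * (1.0 - 2.0 * t) ** 2 if 0.0 < t < 1.0 else 0.0

def F4_closed(z):
    """The two quintic branches stated for the sum x + y = z."""
    if 0.0 < z <= 1.0:
        return 0.6 * (8 * z**5 - 40 * z**4 + 80 * z**3 - 60 * z**2 + 15 * z)
    if 1.0 < z < 2.0:
        return 0.6 * (-8 * z**5 + 40 * z**4 - 80 * z**3 + 100 * z**2 - 95 * z + 46)
    return 0.0

def F4_numeric(z, steps=4000):
    """Midpoint-rule convolution of the law with itself."""
    lo, hi = max(0.0, z - 1.0), min(1.0, z)
    if hi <= lo:
        return 0.0
    h = (hi - lo) / steps
    return h * sum(u_law(lo + (i + 0.5) * h) * u_law(z - (lo + (i + 0.5) * h))
                   for i in range(steps))

for z in (0.3, 0.5, 1.0, 1.5, 1.6):
    print(z, F4_closed(z), F4_numeric(z))
```

The closed form gives 1.8 at z = 1 and equal values at z = 0.5 and z = 1.5, in agreement with the symmetry of the law about z = 1.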





ON THE SYSTEMATIC FITTING OF STRAIGHT 
LINE TRENDS BY STENCIL AND CALCULATING 
MACHINE 


By 


HERBERT A. TOOPS,
Ohio State University


Whenever there is only one plotting point corresponding to $N$
successive abscissal values equally spaced, it is possible greatly to
simplify the fitting of straight lines to the empirical observations.
Let the $N$ several abscissal values (ordinarily time) be
$X_1, X_2, X_3, \cdots, X_N$.
Let these several $X$ values be replaced by a series of transmuted
steps, $x'_i$ $(i = 1, 2, 3, \cdots, N)$.

Let the several corresponding ordinates be $y_1, y_2, \cdots, y_N$.
The situation is represented in Figure 1.


Fig. 1. Illustrating the Notation Employed.

x' -----  1    2    3    4    5  ----
X  -----  X_1  X_2  X_3  X_4  X_5  ----

Letting the equation of the fitted straight line be

(1) $y = a + b x'$,

it is well known that the solutions, by least squares, for the two
constants are,

(2) $a = \dfrac{\Sigma y \cdot \Sigma (x')^{2} - \Sigma x' \cdot \Sigma x'y}{N\,\Sigma (x')^{2} - (\Sigma x')^{2}}$

and

(3) $b = \dfrac{N\,\Sigma x'y - \Sigma x' \cdot \Sigma y}{N\,\Sigma (x')^{2} - (\Sigma x')^{2}}$.

Also that, inasmuch as the $X$ coordinates are an arithmetical series,
we may substitute in the above for $\Sigma x'$ and $\Sigma (x')^{2}$ as
follows:

(4) $\Sigma x' = \dfrac{N(N + 1)}{2}$

(5) $\Sigma (x')^{2} = \tfrac{1}{6} N (N + 1)(2N + 1)$,

thus yielding,

(6) $a = \dfrac{(4N + 2)\,\Sigma y - 6\,\Sigma x'y}{N(N - 1)}$

(7) $b = \dfrac{12\,\Sigma x'y - 6(N + 1)\,\Sigma y}{N(N^{2} - 1)}$

which equations, if of infrequent usage, are highly serviceable. It
is possible, however, to proceed to the derivation of formulae still
more useful for systematic fitting of straight line trends. Thus,
there being only one ordinate to each abscissal value as assumed,

(8) $\Sigma y = y_1 + y_2 + y_3 + \cdots + y_N$

(9) $\Sigma x'y = 1y_1 + 2y_2 + 3y_3 + \cdots + N y_N$
It will be observed further that the denominator $N(N - 1)$ of (6) is
invariably an even number, and therefore exactly divisible by 2.¹
Substituting (8) and (9) in (6), and then multiplying both
numerator and denominator by $\tfrac{1}{2}$, we obtain

(10) $a = \dfrac{(2N + 1)(y_1 + y_2 + \cdots + y_N) - 3\,(1y_1 + 2y_2 + \cdots + N y_N)}{N(N - 1)/2}$

an equation which is a function only of the several ordinates and
of $N$. Furthermore, this equation when solved for specific values
of $N$ leads to a system of equations remarkably simple; and
moreover one easily extended indefinitely.

Thus, when

(11) $N = 2$, $a_2 = \tfrac{1}{1}\,(2y_1 - 1y_2)$

(12) When $N = 3$, $a_3 = \tfrac{1}{3}\,(4y_1 + 1y_2 - 2y_3)$

(13) When $N = 4$, $a_4 = \tfrac{1}{6}\,(6y_1 + 3y_2 + 0y_3 - 3y_4)$

etc., etc.

¹ The desiderata are: 1. To obtain a formula which shall contain as
small multipliers of the several y's as possible, consistent with
2. Integral multipliers, and
3. An integral numerator and denominator for the a and b (i.e. if
and when the y's are integral).

The symmetry of arrangement is more readily grasped if the
several coefficients of the y's, and the denominators, $D_N$, be
collected into an orderly table thus (Figure 2):

Fig. 2. Systematic Solution of Equation (10), for Specific Values of $N$,
for Finding $a$ in
$y = a + b x'$.

Rule: Extend and cumulate the successive y's by the stencil multipliers
of the row of the table appropriate to the problem (determined by
$N$) in question. Divide the accumulated sum by the denominator,
$D_N$, of the same row. The resulting quotient is $a$.

Multiplier of the ordinate:

[Table of stencil multipliers and denominators $D_N$.]

Having such a table at hand it is obvious that $a$ may be
determined quickly by

1. Simply choosing the appropriate row of multipliers for
the number, $N$, of successive plotting points available²; and
2. Extending the several y's by the appropriate multipliers,
most conveniently done by calculating machine;
3. Dividing the sum so obtained by the appropriate divisor,
$D_N$.

² Obviously if any plotting point intermediate between $y_1$ and $y_N$ is
missing it must be supplied (by interpolation) before employing this method.
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The multipliers may be extended indefinitely for larger and 
larger values of N by simply noting that the diagonal marginal row 
increases by the successive addition of -1, while columns increase 
by the successive addition of 2; and rows decrease by the successive 
addition of -3. ‘The denominators have a constant second order 
difference, A = 1, and consequently may be prolonged readily. 
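The rows of such a table follow directly from equation (10): the multiplier of the $i$-th ordinate is $(2N+1) - 3i$ and the divisor is $N(N-1)/2$. A minimal sketch in Python, where the function names `a_stencil_row` and `intercept` are illustrative and not taken from the paper:

```python
def a_stencil_row(n):
    """Multipliers and divisor of equation (10) for n equally spaced ordinates."""
    multipliers = [(2 * n + 1) - 3 * i for i in range(1, n + 1)]
    divisor = n * (n - 1) // 2  # N(N-1) is always even
    return multipliers, divisor


def intercept(ys):
    """Apply the rule of Figure 2: cumulate the products, then divide."""
    multipliers, divisor = a_stencil_row(len(ys))
    return sum(m * y for m, y in zip(multipliers, ys)) / divisor


if __name__ == "__main__":
    for n in range(2, 7):
        print(n, *a_stencil_row(n))
```

Successive rows printed this way show the stated pattern: -1 along the diagonal margin, +2 down each column, -3 along each row.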

Let us now return to $b$, equation (7). The denominator
$N(N^2-1)$ is always divisible by 6. Hence, substituting (8) and
(9) in (7) and dividing both numerator and denominator by 6,
we obtain,

(14)  $b = \dfrac{2(1Y_1 + 2Y_2 + \cdots + NY_N) - (N+1)(Y_1 + Y_2 + \cdots + Y_N)}{\tfrac{1}{6}N(N^2-1)}$.

In like manner, this equation when solved for specific values
of $N$ leads to a systematic series of equations:

(15)  $b_2 = \tfrac{1}{1}(-1y_1 + 1y_2)$  (where $N = 2$)
(16)  $b_3 = \tfrac{1}{4}(-2y_1 + 0y_2 + 2y_3)$  (where $N = 3$)
(17)  $b_4 = \tfrac{1}{10}(-3y_1 - 1y_2 + 1y_3 + 3y_4)$  (where $N = 4$).

The corresponding table yields Figure 3.

FIG. 3. Systematic Solution of Equation (14), for Specific Values of $N$,
for Finding $b$ in $y = a + bx'$.

RULE: Extend and cumulate the successive y's by the stencil multipliers
of the row of the table appropriate to the problem (determined by
$N$) in question. Divide the accumulated sum by the denominator,
$D_b$, of the same row. The resulting quotient is $b$.

   N    D_b    Multiplier of the ordinate:
                 1     2     3     4
   2     1      -1     1
   3     4      -2     0     2
   4    10      -3    -1     1     3
   etc.
The extension of this table is readily made by observing that
the diagonal marginal row increases by adding 1; the columns, by
adding -1, and the rows, by adding 2; while the denominator has
a constant third order difference, $\Delta^3 = 1$.
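The extension rule can be checked mechanically. The sketch below assumes the multiplier of the $i$-th ordinate in equation (14) is $2i - (N+1)$ with divisor $N(N^2-1)/6$, which agrees with equations (15) to (17); the names `b_stencil_row` and `differences` are illustrative only.

```python
def b_stencil_row(n):
    """Multipliers and divisor of equation (14) for n equally spaced ordinates."""
    multipliers = [2 * i - (n + 1) for i in range(1, n + 1)]
    divisor = n * (n * n - 1) // 6  # N(N^2 - 1) is always divisible by 6
    return multipliers, divisor


def differences(values):
    """First-order differences of a sequence."""
    return [b - a for a, b in zip(values, values[1:])]


divisors = [b_stencil_row(n)[1] for n in range(2, 10)]
third_order = differences(differences(differences(divisors)))

if __name__ == "__main__":
    print(divisors)      # successive denominators D_b
    print(third_order)   # constant third-order difference
```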

For hand computations these two tables, Figures 2 and 3, are
undoubtedly simplest because the multipliers are smallest. If, however,
a calculating machine is available, the magnitude of the multipliers
is of relatively small moment if anything is to be gained
by using different multipliers. It is obvious, for example, that the
several multipliers of a row may be divided by the appropriate
denominator, the resulting decimal multipliers, to replace the present
integral multipliers, being presented in tables of $N$ columns or
sections.

An even more useful set of tables for general purposes may
be derived by reducing equations (10) and (14) to a common
(integral) denominator, so that the same denominator may be
employed for calculating both $a$ and $b$.

We may obtain the least common denominator, $\tfrac{1}{2}N(N^2-1)$, by
multiplying equation (10) by $\tfrac{N+1}{N+1}$; and, equation (14), by
multiplying by $\tfrac{3}{3}$, thus:

(18)  $a = \dfrac{(N+1)\,[(2N+1)(y_1 + y_2 + \cdots + y_N) - 3(1y_1 + 2y_2 + \cdots + Ny_N)]}{\tfrac{1}{2}N(N^2-1)}$

(19)  $b = \dfrac{6(1y_1 + 2y_2 + \cdots + Ny_N) - 3(N+1)(y_1 + y_2 + \cdots + y_N)}{\tfrac{1}{2}N(N^2-1)}$

Accordingly, it follows that if the three following changes be effected,
we shall have an integral system:

1. The previous table values of $a$ to be multiplied by $(N+1)$
of the row in question throughout.

2. The previous table values of $b$ to be multiplied by 3
throughout.

3. The common denominator, $\tfrac{1}{2}N(N^2-1)$, of any row in question
to be made to be 3 times the previous denominator of $b$ of
the row.

The two sets of multipliers may now be combined into one
systematic stencil (Figure 4) with a common denominator, $D$, or
common reciprocal, $1/D$. The directions for using this stencil are
as follows:

1. Count the number of plotting points.

2. Find the row of the stencil having the same number of
plotting points, ($N$).

3. Record the y values for the successive plotting points
in the little rectangles of the row just located.

4. Using a calculating machine, obtain the summation of the
extensions of the several y-values by the multipliers immediately
above, employing a fixed decimal point.

5. Divide the sum just found by the divisor, $D$, at the left
hand of the row. The result is $b$.

6. Similarly obtain the summation of the extensions of the
y's by the multipliers of the several respective windows immediately
beneath, again employing a fixed decimal point.

7. Divide the sum thus obtained by the same divisor, $D$. The
result is $a$.

8. Substitute values of $a$ and $b$ in

(1)  $y = a + bx'$.

If we summate (1) we obtain the checking equation,

(20)  $\Sigma y = \tfrac{1}{2}\,[\,2Na + N(N+1)\,b\,]$,

since $\Sigma x' = \tfrac{1}{2}N(N+1)$.

Now, let us employ the revised stencil on a problem (of perfect
fit):


     x          y           x'
   (Age)   (Attainment)

    3.5       12.72         1
    5.5       22.45         2
    7.5       32.18         3
    9.5       41.91         4
   11.5       51.64         5
   13.5       61.37         6
   15.5       71.10         7 = N

   Σx = 66.5   Σy = 293.37   Σx' = 28


FIG. 4. Revised Stencil for Solving Formulae (19) and (18) for $b$ and $a$,
respectively, in $y = a + bx'$.

(19)  $b = \dfrac{6(1y_1 + 2y_2 + \cdots + Ny_N) - 3(N+1)(y_1 + y_2 + \cdots + y_N)}{\tfrac{1}{2}N(N^2-1)}$

(18)  $a = \dfrac{(N+1)\,[(2N+1)(y_1 + y_2 + \cdots + y_N) - 3(1y_1 + 2y_2 + \cdots + Ny_N)]}{\tfrac{1}{2}N(N^2-1)}$

$D = \tfrac{1}{2}N(N^2-1)$

   N     D          Multiplier of Ordinate No.:
                      1     2     3     4     5     6     7     8     9    10    11    12

   2     3    b:     -3     3
              a:      6    -3

   3    12    b:     -6     0     6
              a:     16     4    -8

   4    30    b:     -9    -3     3     9
              a:     30    15     0   -15

   5    60    b:    -12    -6     0     6    12
              a:     48    30    12    -6   -24

   6   105    b:    -15    -9    -3     3     9    15
              a:     70    49    28     7   -14   -35

   7   168    b:    -18   -12    -6     0     6    12    18
              a:     96    72    48    24     0   -24   -48

   8   252    b:    -21   -15    -9    -3     3     9    15    21
              a:    126    99    72    45    18    -9   -36   -63

   9   360    b:    -24   -18   -12    -6     0     6    12    18    24
              a:    160   130   100    70    40    10   -20   -50   -80

  10   495    b:    -27   -21   -15    -9    -3     3     9    15    21    27
              a:    198   165   132    99    66    33     0   -33   -66   -99

  11   660    b:    -30   -24   -18   -12    -6     0     6    12    18    24    30
              a:    240   204   168   132    96    60    24   -12   -48   -84  -120

  12   858    b:    -33   -27   -21   -15    -9    -3     3     9    15    21    27    33
              a:    286   247   208   169   130    91    52    13   -26   -65  -104  -143
Since the X-coordinates are replaced, for computation, by the
$x'$ series, the following transmuting equation prevails:

(21)  $x' = .5x - .75$.

The stencil set up, employed for seven y's:

   b:   -18   -12    -6     0     6    12    18
   a:    96    72    48    24     0   -24   -48

   D = 168

   $b = \tfrac{1}{168}\,[-18(12.72) - 12(22.45) - \cdots + 18(71.10)] = 9.730$

   $a = \tfrac{1}{168}\,[96(12.72) + 72(22.45) + \cdots - 48(71.10)] = 2.990$

whence:

   $y = 2.990 + 9.730\,x'$.

At this stage, application of formula (20) proves the correctness
of this equation.

Now substitute for $x'$ its equivalent $(.5x - .75)$ and

   $y = 4.865\,x - 4.3075$,

which may be checked by summating,

   $\Sigma y = 4.865\,\Sigma x - 4.3075\,N = 4.865\,(66.5) - 4.3075\,(7)$;

i.e. 293.37 = 293.37. The check holds.
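The worked problem can be retraced in a few lines. This is only a sketch of the arithmetic, assuming the multipliers of formulae (18) and (19) as reconstructed from the row for seven ordinates; `combined_stencil` and `fit` are illustrative names.

```python
def combined_stencil(n):
    """Common divisor D and the b- and a-multipliers of formulae (19) and (18)."""
    divisor = n * (n * n - 1) // 2
    b_mult = [6 * i - 3 * (n + 1) for i in range(1, n + 1)]
    a_mult = [(n + 1) * ((2 * n + 1) - 3 * i) for i in range(1, n + 1)]
    return divisor, b_mult, a_mult


def fit(ys):
    """Return (a, b) of y = a + b x' for x' = 1, 2, ..., N."""
    divisor, b_mult, a_mult = combined_stencil(len(ys))
    b = sum(m * y for m, y in zip(b_mult, ys)) / divisor
    a = sum(m * y for m, y in zip(a_mult, ys)) / divisor
    return a, b


xs = [3.5, 5.5, 7.5, 9.5, 11.5, 13.5, 15.5]
ys = [12.72, 22.45, 32.18, 41.91, 51.64, 61.37, 71.10]
a, b = fit(ys)

# Transmute x' = 0.5 x - 0.75 back to the original abscissae.
slope_x = 0.5 * b
intercept_x = a - 0.75 * b

if __name__ == "__main__":
    print(round(a, 3), round(b, 3))
    print(round(slope_x, 4), round(intercept_x, 4))
```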











STATISTICAL ANALYSIS OF ONE-DIMENSIONAL 
DISTRIBUTIONS 


By 


ROBERT SCHMIDT


The present research is to be considered as a contribution to 
a range of science in which the pioneer work has been done by K. 
PEARSON. The method for analysing statistical distributions to be 
developed here differs in principle—as far as the author can see— 
from the known ones. The mathematical resources are all well 
known and so simple that their deduction ab ovo could be carried 
through on a few pages; hence this investigation is intelligible to 
anyone who remembers his mathematical knowledge acquired at 
school. 

The main resource consists of the process of orthogonaliza- 
tion, fundamental in the theory of integral equations. The central 
idea characterizing the following is, not to deal with a frequency 
function itself, nor with its integral function, but with the inverse 
of the integral function. The general scope will be given in No. 3. 

The author is indebted to his wife and to Mr. J. L. K. GIFFORD,
M.A., of Queensland University for kind help in revising the
English text.

1. DESIGNATIONS AND GENERAL ASSUMPTIONS 

A curve $y = \varphi(x)$, $(-\infty < x < +\infty)$ shall be called a “frequency
curve”, the function $\varphi(x)$ a “frequency function”, if $\varphi(x)$ satisfies
the following conditions:

1. $\varphi(x) \ge 0$  $(-\infty < x < +\infty)$
2. The moments $\mu_k = \int_{-\infty}^{+\infty} x^k \varphi(x)\,dx$ exist for $k = 0, 1, \cdots$ ¹
3. $\mu_0 = 1$.

1 In this paper we shall not have to make use of the second condition
(except in the special case $k = 0$); in further notes, too, the condition will
never be applied to its full extent.

For our purposes it is convenient, though not necessary, to
add a fourth condition which it is simplest to formulate by using
the function

$\Phi(x) = \int_{-\infty}^{x} \varphi(u)\,du$.

This function is constantly increasing in $-\infty < x < +\infty$, and
we have

$\lim_{x \to -\infty} \Phi(x) = 0$,  $\lim_{x \to +\infty} \Phi(x) = 1$.

The fourth condition is to guarantee that $\Phi(x)$ assumes every value
from $0 < y < 1$ just once, so that $\Phi(x)$ possesses a unique inverse
function in the ordinary sense.

4. a) $\Phi(x)$ is continuous
   b) At every $x$ where $0 < \Phi(x) < 1$, $\Phi(x)$ is increasing
(strictly speaking), that is: From $x' < x < x''$ it always
follows that $\Phi(x') < \Phi(x) < \Phi(x'')$.

When the conditions 1 - 4 are fulfilled, let us denote $\Phi(x)$ as
the “frequency integral” of the frequency function $\varphi(x)$.

Then there exists one and only one function $\psi(y)$ in $0 < y < 1$,
satisfying

$\psi(\Phi(x)) = x$  $(0 < \Phi(x) < 1)$,

and $\psi(y)$ is called the inverse function of $\Phi(x)$. This function $\psi(y)$
is continuous and constantly increasing (strictly speaking), and
therefore possesses a unique inverse, namely $\Phi(x)$:

$\Phi(\psi(y)) = y$  $(0 < y < 1)$.

We give here some special examples of frequency curves.

I. The “Step Curve”.

$y = \varphi(x) = 1$ in $0 \le x \le 1$;  $0$ otherwise.

The moments are $\mu_k = \dfrac{1}{k+1}$  $(k = 0, 1, 2, \cdots)$.

The frequency integral is

$\Phi(x) = 0$ in $-\infty < x \le 0$;  $x$ in $0 \le x \le 1$;  $1$ in $1 \le x < +\infty$.

The inverse is

$\psi(y) = y$  $(0 < y < 1)$.
II. The Normal Law of Error.

$y = \varphi(x) = \dfrac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$

with the moments

$\mu_k = \dfrac{(2n)!}{2^n\,n!}$ for $k = 2n$;  $\mu_k = 0$ for $k = 2n+1$  $(n = 0, 1, \cdots)$

and the frequency integral

$\Phi(x) = \dfrac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-u^2/2}\,du$.

There are a number of tables of the numerical values of this
function. Of course these tables can be used to compute the values
of $\psi(y)$. Considering the fact that, for our purposes, the values of
$\psi(y)$ will often be required for simple rational arguments only, it
seems useful to have tables which are converse to those just quoted,
that is to say, the tabulated entry of which is $x = \psi(y)$ and the
argument $y = \Phi(x)$. Such tables have been calculated by KELLEY
and WOOD (Statistical Method, New York 1924; Appendix C).
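Where such a converse table is not at hand, the inverse $\psi(y)$ of the normal frequency integral can be evaluated numerically. The sketch below is a modern convenience rather than anything used in the paper; it relies on `statistics.NormalDist` from the Python standard library, and the names `Phi` and `psi` simply mirror the notation of the text.

```python
from statistics import NormalDist

_standard = NormalDist(mu=0.0, sigma=1.0)


def Phi(x):
    """Frequency integral of the normal law of error."""
    return _standard.cdf(x)


def psi(y):
    """Inverse of Phi, defined for 0 < y < 1."""
    return _standard.inv_cdf(y)


if __name__ == "__main__":
    # Simple rational arguments, as a converse table would list them.
    for y in (0.1, 0.25, 0.5, 0.75, 0.9):
        print(y, round(psi(y), 4))
```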

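III. The Laplace Curve.

$y = \varphi(x) = \tfrac{1}{2}\,e^{-|x|}$

$\mu_k = (2n)!$ for $k = 2n$;  $\mu_k = 0$ for $k = 2n+1$  $(n = 0, 1, \cdots)$

$\Phi(x) = \tfrac{1}{2}\,e^{x}$ in $-\infty < x \le 0$;  $1 - \tfrac{1}{2}\,e^{-x}$ in $0 \le x < +\infty$

$\psi(y) = \log y + \log 2$ in $0 < y \le \tfrac{1}{2}$;  $-\log(1-y) - \log 2$ in $\tfrac{1}{2} \le y < 1$

IV. The “Tine Curve”.

$\varphi(x) = 0$ in $-\infty < x \le -1$;  $1 + x$ in $-1 \le x \le 0$;  $1 - x$ in $0 \le x \le 1$;  $0$ in $+1 \le x < +\infty$

$\Phi(x) = 0$ in $-\infty < x \le -1$;  $\tfrac{1}{2}(1+x)^2$ in $-1 \le x \le 0$;  $1 - \tfrac{1}{2}(1-x)^2$ in $0 \le x \le +1$;  $1$ in $+1 \le x < +\infty$

$\psi(y) = -1 + \sqrt{2y}$ in $0 < y \le \tfrac{1}{2}$;  $1 - \sqrt{2(1-y)}$ in $\tfrac{1}{2} \le y < 1$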
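The closed forms for the Laplace curve and the tine curve can be checked by confirming that each $\psi$ really inverts its $\Phi$. A minimal sketch, with function names chosen here for illustration:

```python
import math


def laplace_cdf(x):
    """Frequency integral of the Laplace curve."""
    return 0.5 * math.exp(x) if x <= 0 else 1.0 - 0.5 * math.exp(-x)


def laplace_inv(y):
    """Inverse of laplace_cdf for 0 < y < 1."""
    return math.log(2.0 * y) if y <= 0.5 else -math.log(2.0 * (1.0 - y))


def tine_cdf(x):
    """Frequency integral of the tine (triangular) curve on [-1, 1]."""
    if x <= -1.0:
        return 0.0
    if x <= 0.0:
        return 0.5 * (1.0 + x) ** 2
    if x <= 1.0:
        return 1.0 - 0.5 * (1.0 - x) ** 2
    return 1.0


def tine_inv(y):
    """Inverse of tine_cdf for 0 < y < 1."""
    return -1.0 + math.sqrt(2.0 * y) if y <= 0.5 else 1.0 - math.sqrt(2.0 * (1.0 - y))


if __name__ == "__main__":
    for y in (0.125, 0.5, 0.875):
        print(y, laplace_inv(y), tine_inv(y))
```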
2. EKKE'S “BEST VALUES”

A. EKKE, in his Kiel dissertation (to appear), deals with the
following question among others: Suppose a frequency function
$\varphi(x)$ and a natural number $n$ given. Which one among all systems
of $n$ values $x_1, \cdots, x_n$ might be considered the “best”? To give
an answer to this question, EKKE divides the total $x$-axis into $n$
parts $J_1, \cdots, J_n$ with the partition points $\bar{x}_1, \cdots, \bar{x}_{n-1}$ in such
a manner that

$\int_{J_1} \varphi(x)\,dx = \int_{J_2} \varphi(x)\,dx = \cdots = \int_{J_n} \varphi(x)\,dx = \dfrac{1}{n}$.

Evidently this is possible in one and only one manner, and
we have

$\bar{x}_1 = \psi\!\left(\tfrac{1}{n}\right),\ \bar{x}_2 = \psi\!\left(\tfrac{2}{n}\right),\ \cdots,\ \bar{x}_{n-1} = \psi\!\left(\tfrac{n-1}{n}\right)$.

Each of the parts $J_1, \cdots, J_n$ should contain just one value of
the system. Furthermore it seems reasonable to fix every point $x_\nu$
within its interval $J_\nu$ by the conditions

$\int_{-\infty}^{x_1} \varphi\,dx = \int_{x_1}^{\bar{x}_1} \varphi\,dx,\ \cdots,\ \int_{\bar{x}_{n-1}}^{x_n} \varphi\,dx = \int_{x_n}^{+\infty} \varphi\,dx$.

This also can be done in one and only one way. Let us designate
these “best values” by $\xi_1, \cdots, \xi_n$. We have

$\xi_1 = \psi\!\left(\tfrac{1}{2n}\right),\ \xi_2 = \psi\!\left(\tfrac{3}{2n}\right),\ \cdots,\ \xi_n = \psi\!\left(\tfrac{2n-1}{2n}\right)$.

Concerning the best values, EKKE proves two theorems which
accentuate the rationality of the definition. If $x_1 \le x_2 \le \cdots \le x_n$ are
values arranged according to magnitude, and

$S(x;\,x_1, \cdots, x_n) = 0$ in $-\infty < x < x_1$;  $\dfrac{\nu}{n}$ in $x_\nu \le x < x_{\nu+1}$  $(\nu = 1, \cdots, n-1)$;  $1$ in $x_n \le x < +\infty$,

the following theorem holds:

“There is one and only one system $x_1, \cdots, x_n$ for which

$\int_{-\infty}^{+\infty} \{\Phi(x) - S(x;\,x_1, \cdots, x_n)\}^2\,dx$

assumes a minimum, and this system is $x_1 = \xi_1, \cdots, x_n = \xi_n$.”
This theorem also holds if the exponent 2 is replaced by an arbitrary
positive number. Furthermore:
“There is one and only one set $x_1, \cdots, x_n$ for which the lowest
upper boundary of

$|\,\Phi(x) - S(x;\,x_1, \cdots, x_n)\,|$

assumes a minimum, and this set again is identical with $\xi_1, \cdots, \xi_n$.”

For normalizing purposes EKKE considers, together with a
given frequency function $\varphi(x)$, the totality of the frequency functions
which result by linear transformations of the argument, i.e.
which result by translations and dilatations in the direction of the
$x$-axis (or by choosing new origins and new units of measurement).
With an arbitrary $\beta$, and $\gamma > 0$, we have to form

$\tilde{\varphi}(x) = \dfrac{1}{\gamma}\,\varphi\!\left(\dfrac{x - \beta}{\gamma}\right)$,

the first factor $\tfrac{1}{\gamma}$ being required in order to comply with condition 3.
The $\tilde{\Phi}$ corresponding to $\tilde{\varphi}(x)$ is

$\tilde{\Phi}(x) = \Phi\!\left(\dfrac{x - \beta}{\gamma}\right)$,

and the inverse

$\tilde{\psi}(y) = \gamma\,\psi(y) + \beta$.

Due to this simple relation between $\tilde{\psi}(y)$ and $\psi(y)$, we have
evidently, if $\tilde{\xi}_1, \cdots, \tilde{\xi}_n$ designate the best values of $\tilde{\varphi}(x)$,

$\tilde{\xi}_1 = \gamma\,\xi_1 + \beta,\ \tilde{\xi}_2 = \gamma\,\xi_2 + \beta,\ \cdots,\ \tilde{\xi}_n = \gamma\,\xi_n + \beta$.

This fact can be used to pick out from the multitude of functions
$\tilde{\varphi}(x)$ a distinct specimen, and then to operate with its best values
only. It is easy to show in a direct manner that there is exactly
one specimen in the multitude which complies with the additional
conditions $\mu_1 = 0$, $\mu_2 = 1$.
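Since the best values are obtained by evaluating the inverse $\psi$ at the arguments $(2\nu - 1)/2n$, they are easy to compute once $\psi$ is available. The following sketch assumes that reading of the definition; `best_values` is an illustrative name, and the shifted and dilated normal law is represented by `NormalDist(beta, gamma)` from the standard library.

```python
from statistics import NormalDist


def best_values(psi, n):
    """Best values xi_1..xi_n: psi evaluated at (2v - 1) / (2n), v = 1..n."""
    return [psi((2 * v - 1) / (2 * n)) for v in range(1, n + 1)]


# Step curve: psi(y) = y, so the best values are the interval midpoints.
step = best_values(lambda y: y, 4)

# Normal law, and a translated and dilated member of the same type.
beta, gamma = 10.0, 2.0
xi = best_values(NormalDist().inv_cdf, 5)
xi_tilde = best_values(NormalDist(beta, gamma).inv_cdf, 5)

if __name__ == "__main__":
    print(step)
    print([round(v, 4) for v in xi])
    print([round(v, 4) for v in xi_tilde])
```

The last two lists differ exactly by the linear map $\xi \mapsto \gamma\xi + \beta$, which is the relation the text uses to single out one specimen of each type.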

3. THE STARTING POINT. GENERAL SCOPE.

But the proof of the fact just mentioned can be given indirectly
too by considering the inverses $\tilde{\psi}(y)$, and it is this way which
gives the starting point of our further developments. Indeed, if
we introduce, for simplicity, Stieltjes integrals, the conditions
$\mu_1 = 0$, $\mu_2 = 1$ mean

$\int x\,d\tilde{\Phi}(x) = 0$,  $\int x^2\,d\tilde{\Phi}(x) = 1$,

and by the substitution $x = \tilde{\psi}(y)$ we get

$\int_0^1 \tilde{\psi}(y)\,dy = 0$,  $\int_0^1 \tilde{\psi}(y)^2\,dy = 1$

or

$\int_0^1 (\beta + \gamma\,\psi(y))\,dy = 0$,  $\int_0^1 (\beta + \gamma\,\psi(y))^2\,dy = 1$.

Let us put

$\psi_0(y) = 1$,  $\psi_1(y) = \psi(y)$

and

$\chi_0(y) = \psi_0(y)$,
$\chi_1(y) = \beta\,\psi_0(y) + \gamma\,\psi_1(y)$.

Then our conditions are equivalent to the following demand: Find
coefficients $\beta$, $\gamma$ $(\gamma > 0)$ in such a manner that

$\int_0^1 \chi_0(y)^2\,dy = 1$,  $\int_0^1 \chi_0(y)\,\chi_1(y)\,dy = 0$,  $\int_0^1 \chi_1(y)^2\,dy = 1$.

We add: The functions $\psi_0(y)$ and $\psi_1(y)$ are linearly independent,
i.e. $\lambda_0\,\psi_0(y) + \lambda_1\,\psi_1(y) = 0$ cannot hold except for $\lambda_0 = \lambda_1 = 0$.

Now it is obvious that our demand represents a special case of
the general problem as follows: Given a set of linearly independent
continuous functions $\psi_0(y), \psi_1(y), \cdots, \psi_k(y)$ $(0 < y < 1)$. The scheme
of coefficients

$\beta_{00}$
$\beta_{10},\ \beta_{11}$
$\cdots$
$\beta_{k0},\ \beta_{k1},\ \cdots,\ \beta_{kk}$

satisfying the additional conditions $\beta_{00} > 0$, $\beta_{11} > 0$, $\cdots$, $\beta_{kk} > 0$,
shall be chosen so that the functions

$\chi_0(y) = \beta_{00}\,\psi_0(y)$
$\chi_1(y) = \beta_{10}\,\psi_0(y) + \beta_{11}\,\psi_1(y)$
$\cdots$
$\chi_k(y) = \beta_{k0}\,\psi_0(y) + \beta_{k1}\,\psi_1(y) + \cdots + \beta_{kk}\,\psi_k(y)$

form a normalized orthogonal system, i.e.

$\int_0^1 \chi_p(y)\,\chi_q(y)\,dy = 1$ for $p = q$;  $0$ for $p \ne q$.

It is well known that there is one and only one suitable scheme,
and it is furnished by the so-called process of orthogonalization.
Furthermore it is well known that the process of orthogonalization
is intimately connected with another problem: An arbitrary
continuous function $F(y)$ given, to determine the coefficients $c_0, \cdots, c_k$
so that

$\int_0^1 \{F(y) - [\,c_0\,\psi_0(y) + \cdots + c_k\,\psi_k(y)\,]\}^2\,dy$

assumes a minimum.

Concerning frequency functions, we are led, by pursuing
this line, to a general theory of curve types; an account of the
results to be obtained will be given in a future article.

Concerning our analysis of statistical data, we do not intend
to use from a given frequency function more than its best values.
More precisely: we intend to replace the frequency function by its
best values. Our modus procedendi now results by analogy: we
have to deal with systems of values (vectors)

$(u_{11}, u_{12}, \cdots, u_{1n})$
$\cdots$
$(u_{k1}, u_{k2}, \cdots, u_{kn})$

which are linearly independent (see No. 5). We have to employ
the process of orthogonalization, which gives a normalized orthogonal
system (see No. 6)

$(w_{11}, w_{12}, \cdots, w_{1n})$
$\cdots$
$(w_{k1}, w_{k2}, \cdots, w_{kn})$

and we have to direct our attention to the sums of the form

$\dfrac{1}{n} \sum_{\nu=1}^{n} \{u_\nu - (c_1 u_{1\nu} + \cdots + c_k u_{k\nu})\}^2$

or better

$\dfrac{1}{n} \sum_{\nu=1}^{n} \{u_\nu - (c_1 w_{1\nu} + \cdots + c_k w_{k\nu})\}^2$.

Finally we have to introduce the special set of vectors

$(1, 1, \cdots, 1)$
$(\xi_1, \xi_2, \cdots, \xi_n)$
$\cdots$
$(\xi_1^{\,n-1}, \xi_2^{\,n-1}, \cdots, \xi_n^{\,n-1})$

where $\xi_1, \cdots, \xi_n$ designate the best values of a frequency function.

We are now in the position to characterize the direction of
our research in general words: A statistical analysis of distributions
as an application of the theory of orthogonal systems, based
upon the best values of a given frequency function.

4. VECTORS

For our purpose it is convenient to make use of the notations
and simplest operations of vector analysis. If $u_1, u_2, \cdots, u_n$ are a set
of numbers, we take the symbol $(u_1, \cdots, u_n)$ as an individual, call
it a vector, and designate it by a gothic letter:

$\mathfrak{u} = (u_1, \cdots, u_n)$.

Equality of two vectors $\mathfrak{u} = (u_1, \cdots, u_n)$ and $\mathfrak{v} = (v_1, \cdots, v_n)$
is defined by

$u_1 = v_1,\ u_2 = v_2,\ \cdots,\ u_n = v_n$

and is written $\mathfrak{u} = \mathfrak{v}$. The products of a number $c$ with a vector
$\mathfrak{u}$ are defined by

$c\,\mathfrak{u} = (c u_1, \cdots, c u_n)$
$\mathfrak{u}\,c = (u_1 c, \cdots, u_n c)$;

the sum of two vectors $\mathfrak{u}$ and $\mathfrak{v}$ by

$\mathfrak{u} + \mathfrak{v} = (u_1 + v_1, \cdots, u_n + v_n)$.

Evidently we have

$c\,\mathfrak{u} = \mathfrak{u}\,c$

and

$(\mathfrak{u} + \mathfrak{v}) + \mathfrak{w} = \mathfrak{u} + (\mathfrak{v} + \mathfrak{w})$.

Hence we may omit the brackets, and the sum of three or more
vectors has a definite sense. More generally, the meaning of the
expression

$c_1 \mathfrak{u}_1 + \cdots + c_k \mathfrak{u}_k$

is clear. The product of two vectors is (somewhat differently from
the customary way) defined as a NUMBER, viz.

$\mathfrak{u}\,\mathfrak{v} = \dfrac{1}{n}\,(u_1 v_1 + \cdots + u_n v_n)$,

and we have

$\mathfrak{u}\,\mathfrak{v} = \mathfrak{v}\,\mathfrak{u}$
$(\mathfrak{u} + \mathfrak{v})\,\mathfrak{w} = \mathfrak{u}\,\mathfrak{w} + \mathfrak{v}\,\mathfrak{w}$.

But in general the vectors $(\mathfrak{u}\,\mathfrak{v})\,\mathfrak{w}$ and $\mathfrak{u}\,(\mathfrak{v}\,\mathfrak{w})$ are entirely
different.

Let us put $\mathfrak{o} = (0, 0, \cdots, 0)$.

Every vector $\mathfrak{u}$ satisfies $\mathfrak{u}^2 = \mathfrak{u}\,\mathfrak{u} \ge 0$, and $\mathfrak{o}$ is the only
vector for which $\mathfrak{u}^2 = 0$ holds. Whenever the square root
of the square of a vector,

$\sqrt{\mathfrak{u}^2} = \sqrt{\dfrac{u_1^2 + \cdots + u_n^2}{n}}$,

is met with, we always mean the positive value.
5. LINEAR INDEPENDENCE 
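A set of vectors $\mathfrak{u}_1, \cdots, \mathfrak{u}_k$ is said to be linearly independent
if the equation

$\lambda_1 \mathfrak{u}_1 + \lambda_2 \mathfrak{u}_2 + \cdots + \lambda_k \mathfrak{u}_k = \mathfrak{o}$

does not hold except for $\lambda_1 = \lambda_2 = \cdots = \lambda_k = 0$. Otherwise the
vectors are said to be linearly dependent. If the vectors
$\mathfrak{u}_1, \cdots, \mathfrak{u}_k$
are linearly independent, all the more the same is true for every
partial system. Especially:

$\mathfrak{u}_1 \ne \mathfrak{o},\ \mathfrak{u}_2 \ne \mathfrak{o},\ \cdots,\ \mathfrak{u}_k \ne \mathfrak{o}$.

THEOREM 1. “Let $\mathfrak{u}_1, \cdots, \mathfrak{u}_k$ be linearly independent; form
the vectors

(2)  $\mathfrak{u}_1^* = a_{11}\,\mathfrak{u}_1$
     $\mathfrak{u}_2^* = a_{21}\,\mathfrak{u}_1 + a_{22}\,\mathfrak{u}_2$
     $\cdots$
     $\mathfrak{u}_k^* = a_{k1}\,\mathfrak{u}_1 + a_{k2}\,\mathfrak{u}_2 + \cdots + a_{kk}\,\mathfrak{u}_k$

and suppose

$a_{11} \ne 0,\ a_{22} \ne 0,\ \cdots,\ a_{kk} \ne 0$.

Then the vectors $\mathfrak{u}_1^*, \cdots, \mathfrak{u}_k^*$ are also linearly independent.”

In fact, if there were a relation of the form

$\lambda_1 \mathfrak{u}_1^* + \cdots + \lambda_k \mathfrak{u}_k^* = \mathfrak{o}$

and the factors $\lambda_1, \cdots, \lambda_k$ were not all equal to zero, then there
would be a last factor differing from zero, say $\lambda_r$, and we should
have

$\lambda_1 \mathfrak{u}_1^* + \cdots + \lambda_r \mathfrak{u}_r^* = \mathfrak{o}$  $(\lambda_r \ne 0)$;

if we were to replace $\mathfrak{u}_1^*, \cdots, \mathfrak{u}_r^*$ by the expressions (2), we
should get a relation of the form

$\mu_1 \mathfrak{u}_1 + \cdots + \mu_{r-1} \mathfrak{u}_{r-1} + a_{rr}\,\lambda_r\,\mathfrak{u}_r = \mathfrak{o}$,

which is impossible on account of $a_{rr} \ne 0$, $\lambda_r \ne 0$ and the presupposed
linear independence of $\mathfrak{u}_1, \cdots, \mathfrak{u}_k$.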
In order to prove some further theorems it is convenient,
but not necessary, to make use of the following fundamental
theorem concerning systems of homogeneous linear equations.
“A necessary and sufficient condition that the system of equations

$a_{11} \lambda_1 + \cdots + a_{1n} \lambda_n = 0$
$\cdots$
$a_{n1} \lambda_1 + \cdots + a_{nn} \lambda_n = 0$

should have no other solution than $\lambda_1 = \lambda_2 = \cdots = \lambda_n = 0$ is

$\begin{vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{vmatrix} \ne 0$.”

From this statement at once follows:
THEOREM 2. “A necessary and sufficient condition that the
vectors

$\mathfrak{u}_1 = (u_{11}, u_{12}, \cdots, u_{1n})$
$\cdots$
$\mathfrak{u}_n = (u_{n1}, u_{n2}, \cdots, u_{nn})$

should be linearly independent is

$\begin{vmatrix} u_{11} & \cdots & u_{1n} \\ \vdots & & \vdots \\ u_{n1} & \cdots & u_{nn} \end{vmatrix} \ne 0$.”

In fact, linear dependence of $\mathfrak{u}_1, \cdots, \mathfrak{u}_n$ is equivalent to the
existence of values $\lambda_1, \cdots, \lambda_n$, not all equal to zero, satisfying

$\lambda_1 u_{11} + \lambda_2 u_{21} + \cdots + \lambda_n u_{n1} = 0$
$\cdots$
$\lambda_1 u_{1n} + \lambda_2 u_{2n} + \cdots + \lambda_n u_{nn} = 0$,

and the determinant of these equations is equal to the determinant
above.

THEOREM 3. “If $\mathfrak{u}_1, \cdots, \mathfrak{u}_k$ are linearly independent, the
number $k$ of the vectors cannot exceed the number $n$ of the components:
$k \le n$.”

We prove this theorem by showing:

“If $n+1$ vectors

$\mathfrak{u}_1 = (u_{11}, \cdots, u_{1n})$
$\cdots$
$\mathfrak{u}_{n+1} = (u_{n+1,1}, \cdots, u_{n+1,n})$

are given, they are linearly dependent.”

For obviously, the determinant of the equations

$\lambda_1 u_{11} + \cdots + \lambda_{n+1} u_{n+1,1} = 0$
$\cdots$
$\lambda_1 u_{1n} + \cdots + \lambda_{n+1} u_{n+1,n} = 0$
$\lambda_1 \cdot 0 + \cdots + \lambda_{n+1} \cdot 0 = 0$

vanishes, hence the system possesses a solution $\lambda_1, \cdots, \lambda_{n+1}$ different
from $0, \cdots, 0$, and with such values $\lambda_1, \cdots, \lambda_{n+1}$ the
first $n$ equations mean

$\lambda_1 \mathfrak{u}_1 + \cdots + \lambda_{n+1} \mathfrak{u}_{n+1} = \mathfrak{o}$.
6. NORMALIZED ORTHOGONAL SYSTEMS OF VECTORS 

If $\mathfrak{u}^2 = 1$, the vector $\mathfrak{u}$ is said to be normalized. Every
vector $\mathfrak{u} \ne \mathfrak{o}$ can be normalized by multiplying it by $1/\sqrt{\mathfrak{u}^2}$.

If $\mathfrak{u}\,\mathfrak{v} = 0$, the pair of vectors $\mathfrak{u}$ and $\mathfrak{v}$ is said to be
orthogonal. The vectors $\mathfrak{u}_1, \cdots, \mathfrak{u}_k$ are said to form an orthogonal
system, if every pair of them is orthogonal.

Finally the vectors $\mathfrak{u}_1, \cdots, \mathfrak{u}_k$ are said to form a normalized
orthogonal system, if they form an orthogonal system and each
of them is normalized. Accordingly a normalized orthogonal system
is characterized by the conditions

(3)  $\mathfrak{u}_p\,\mathfrak{u}_q = 1$ for $p = q$;  $0$ for $p \ne q$.

Vectors forming a normalized orthogonal system necessarily
are linearly independent. For, from

$\lambda_1 \mathfrak{u}_1 + \cdots + \lambda_k \mathfrak{u}_k = \mathfrak{o}$

follows

$(\lambda_1 \mathfrak{u}_1 + \cdots + \lambda_k \mathfrak{u}_k)\,\mathfrak{u}_r = \mathfrak{o}\,\mathfrak{u}_r = 0$  $(r = 1, 2, \cdots, k)$

or

$\lambda_1\,\mathfrak{u}_1 \mathfrak{u}_r + \cdots + \lambda_k\,\mathfrak{u}_k \mathfrak{u}_r = 0$,

and from (3):

$\lambda_r = 0$  $(r = 1, 2, \cdots, k)$.
7. THE PROCESS OF ORTHOGONALIZATION 


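THEOREM 4. “If the vectors $\mathfrak{u}_1, \cdots, \mathfrak{u}_k$ are linearly independent,
there is one and only one scheme of values

$\beta_{11}$  $(\beta_{11} > 0)$
$\beta_{21},\ \beta_{22}$  $(\beta_{22} > 0)$
$\cdots$
$\beta_{k1},\ \beta_{k2},\ \cdots,\ \beta_{kk}$  $(\beta_{kk} > 0)$

so that the vectors

(4)  $\mathfrak{w}_1 = \beta_{11}\,\mathfrak{u}_1$
     $\mathfrak{w}_2 = \beta_{21}\,\mathfrak{u}_1 + \beta_{22}\,\mathfrak{u}_2$
     $\cdots$
     $\mathfrak{w}_k = \beta_{k1}\,\mathfrak{u}_1 + \beta_{k2}\,\mathfrak{u}_2 + \cdots + \beta_{kk}\,\mathfrak{u}_k$

form a normalized orthogonal system.”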
To prove this theorem, fundamental for our analysis, let us
consider

(5)  $\mathfrak{v}_1 = \mathfrak{u}_1$
     $\mathfrak{v}_2 = \gamma_{21}\,\mathfrak{u}_1 + \mathfrak{u}_2$
     $\cdots$
     $\mathfrak{v}_k = \gamma_{k1}\,\mathfrak{u}_1 + \cdots + \gamma_{k,k-1}\,\mathfrak{u}_{k-1} + \mathfrak{u}_k$.

From theorem 1 it follows that the vectors $\mathfrak{v}_1, \cdots, \mathfrak{v}_k$ are
linearly independent. Let us assume we have already proved that
there is one and only one system of values $\gamma$ so that (5) is an
orthogonal system. Then it follows firstly that the coefficients $\beta$
in (4) can be chosen in at least one suitable manner. For we have

$\mathfrak{v}_1^2 > 0,\ \mathfrak{v}_2^2 > 0,\ \cdots,\ \mathfrak{v}_k^2 > 0$,

and

$\beta_{pp} = \dfrac{1}{\sqrt{\mathfrak{v}_p^2}}$,  $\beta_{pq} = \dfrac{\gamma_{pq}}{\sqrt{\mathfrak{v}_p^2}}$  $(q < p)$

are suitable values. Secondly we can deduce the uniqueness of the
coefficients $\beta$ in (4). For suppose $\beta$ and $\beta^*$ to be two suitable
systems of coefficients; this suggests that we form

$\gamma_{pq} = \dfrac{\beta_{pq}}{\beta_{pp}}$  $(q < p)$,

associated with

$\mathfrak{v}_p = \dfrac{1}{\beta_{pp}}\,\mathfrak{w}_p$,

and

$\gamma_{pq}^* = \dfrac{\beta_{pq}^*}{\beta_{pp}^*}$  $(q < p)$,

associated with

$\mathfrak{v}_p^* = \dfrac{1}{\beta_{pp}^*}\,\mathfrak{w}_p^*$.

The vectors $\mathfrak{v}_1, \cdots, \mathfrak{v}_k$ as well as $\mathfrak{v}_1^*, \cdots, \mathfrak{v}_k^*$ form orthogonal
systems of the type (5), hence

$\mathfrak{v}_1^* = \mathfrak{v}_1,\ \cdots,\ \mathfrak{v}_k^* = \mathfrak{v}_k$

and furthermore, both systems $\mathfrak{w}$ and $\mathfrak{w}^*$ being normalized with positive factors,

$\mathfrak{w}_1^* = \mathfrak{w}_1,\ \cdots,\ \mathfrak{w}_k^* = \mathfrak{w}_k$.

Finally, on account of the linear independence of $\mathfrak{u}_1, \cdots, \mathfrak{u}_k$,

$\beta_{11}^* = \beta_{11},\ \beta_{21}^* = \beta_{21},\ \beta_{22}^* = \beta_{22},\ \cdots,\ \beta_{kk}^* = \beta_{kk}$.

Accordingly we may confine ourselves to proving the existence
and uniqueness of suitable coefficients $\gamma$ in (5).
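This proposition is true for $k = 1$. Let $k \ge 2$ arbitrarily, and
assume the proposition to be proved up to $k-1$. The vectors
$\mathfrak{v}_1, \cdots, \mathfrak{v}_{k-1}$ therefore are orthogonal, and we have to show only:
There is one and only one set of values $\gamma_{k1}, \cdots, \gamma_{k,k-1}$ so that
the conditions

(6)  $\mathfrak{v}_k \mathfrak{v}_1 = 0,\ \mathfrak{v}_k \mathfrak{v}_2 = 0,\ \cdots,\ \mathfrak{v}_k \mathfrak{v}_{k-1} = 0$

are satisfied.

The vectors $\mathfrak{u}_1, \cdots, \mathfrak{u}_{k-1}$ can be represented as linear combinations
of $\mathfrak{v}_1, \cdots, \mathfrak{v}_{k-1}$:

$\mathfrak{u}_1 = \mathfrak{v}_1$
$\mathfrak{u}_2 = \delta_{21}\,\mathfrak{v}_1 + \mathfrak{v}_2$
$\cdots$
$\mathfrak{u}_{k-1} = \delta_{k-1,1}\,\mathfrak{v}_1 + \cdots + \delta_{k-1,k-2}\,\mathfrak{v}_{k-2} + \mathfrak{v}_{k-1}$.

We introduce this into $\mathfrak{v}_k$, and get

(7)  $\mathfrak{v}_k = c_{k1}\,\mathfrak{v}_1 + \cdots + c_{k,k-1}\,\mathfrak{v}_{k-1} + \mathfrak{u}_k$

with

(8)  $c_{k1} = \gamma_{k1} + \delta_{21}\,\gamma_{k2} + \cdots + \delta_{k-1,1}\,\gamma_{k,k-1}$
     $c_{k2} = \gamma_{k2} + \cdots + \delta_{k-1,2}\,\gamma_{k,k-1}$
     $\cdots$
     $c_{k,k-1} = \gamma_{k,k-1}$.

From the linear independence of $\mathfrak{v}_1, \cdots, \mathfrak{v}_{k-1}$ we have
$\mathfrak{v}_1^2 > 0, \cdots, \mathfrak{v}_{k-1}^2 > 0$, and therefore we can deduce from (7) and (6):

(9)  $c_{k1} = -\dfrac{\mathfrak{u}_k \mathfrak{v}_1}{\mathfrak{v}_1^2},\ \cdots,\ c_{k,k-1} = -\dfrac{\mathfrak{u}_k \mathfrak{v}_{k-1}}{\mathfrak{v}_{k-1}^2}$.

The coefficients $\gamma_{k1}, \cdots, \gamma_{k,k-1}$ having to satisfy the equations (8)
with the values (9) of $c_{k1}, \cdots, c_{k,k-1}$, there can exist only
one suitable system $\gamma_{k1}, \cdots, \gamma_{k,k-1}$.

Conversely: if $c_{k1}, \cdots, c_{k,k-1}$ are chosen according to (9),
and then $\gamma_{k1}, \cdots, \gamma_{k,k-1}$ calculated from (8), evidently there results
a vector $\mathfrak{v}_k$ satisfying (6).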

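We add:

THEOREM 5. “If $\mathfrak{u}_1, \cdots, \mathfrak{u}_k$ are linearly independent, and
$\mathfrak{w}_1, \cdots, \mathfrak{w}_k$ is the corresponding normalized orthogonal system,
then the normalized orthogonal system $\mathfrak{w}_1^*, \cdots, \mathfrak{w}_k^*$ corresponding
to

$\mathfrak{u}_1^* = a_{11}\,\mathfrak{u}_1$  $(a_{11} > 0)$
$\mathfrak{u}_2^* = a_{21}\,\mathfrak{u}_1 + a_{22}\,\mathfrak{u}_2$  $(a_{22} > 0)$
$\cdots$
$\mathfrak{u}_k^* = a_{k1}\,\mathfrak{u}_1 + a_{k2}\,\mathfrak{u}_2 + \cdots + a_{kk}\,\mathfrak{u}_k$  $(a_{kk} > 0)$

is identical with $\mathfrak{w}_1, \cdots, \mathfrak{w}_k$.”

Obviously the vectors $\mathfrak{w}^*$ are of the form

$\mathfrak{w}_1^* = \beta_{11}'\,\mathfrak{u}_1$  $(\beta_{11}' > 0)$
$\cdots$
$\mathfrak{w}_k^* = \beta_{k1}'\,\mathfrak{u}_1 + \beta_{k2}'\,\mathfrak{u}_2 + \cdots + \beta_{kk}'\,\mathfrak{u}_k$  $(\beta_{kk}' > 0)$.

The proof of theorem 5 now follows as an immediate application
of theorem 4.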
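The process of orthogonalization described in this section can be illustrated numerically. The sketch below uses the product of No. 4, which carries the factor $1/n$, and subtracts from each new vector its components along the vectors already normalized; this is the usual Gram-Schmidt arrangement and is offered as an equivalent computation, not as the paper's own scheme. It assumes the input vectors are linearly independent, and the names `dot` and `orthonormalize` are illustrative.

```python
import math


def dot(u, v):
    """Product of two vectors as defined in No. 4: (1/n) * sum(u_i * v_i)."""
    return sum(a * b for a, b in zip(u, v)) / len(u)


def orthonormalize(vectors):
    """Return the normalized orthogonal system belonging to the given vectors.

    Assumes linear independence; otherwise a zero norm is met and the
    division fails.
    """
    system = []
    for u in vectors:
        v = list(u)
        for w in system:
            c = dot(u, w)  # component of u along an earlier normalized vector
            v = [vi - c * wi for vi, wi in zip(v, w)]
        norm = math.sqrt(dot(v, v))  # positive root, so the diagonal factor is > 0
        system.append([vi / norm for vi in v])
    return system


ws = orthonormalize([[1, 1, 1], [1, 2, 3], [1, 4, 9]])

if __name__ == "__main__":
    for w in ws:
        print([round(x, 4) for x in w])
```

Rescaling any input vector by a positive factor, or adding to it a combination of the preceding ones, leaves `ws` unchanged, which is the content of theorem 5.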


8. COMPLETE SYSTEMS OF NORMALIZED 
ORTHOGONAL VECTORS 

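A system of normalized orthogonal vectors $\mathfrak{w}_1, \cdots, \mathfrak{w}_k$ is
said to be complete if, corresponding to every arbitrary vector $\mathfrak{u}$,
there exist coefficients $b_1, \cdots, b_k$ so that

(10)  $\{\mathfrak{u} - (b_1 \mathfrak{w}_1 + \cdots + b_k \mathfrak{w}_k)\}^2 = 0$

holds. Evidently, (10) is equivalent to

$\mathfrak{u} = b_1 \mathfrak{w}_1 + \cdots + b_k \mathfrak{w}_k$.

THEOREM 6. “If the vectors $\mathfrak{w}_1, \cdots, \mathfrak{w}_k$ $(k = n)$ form a normalized
orthogonal system, then this system is COMPLETE.”

Proof. According to theorem 3, the $n+1$ vectors $\mathfrak{w}_1, \cdots, \mathfrak{w}_n, \mathfrak{u}$
are linearly dependent, i.e. there is an equality

$\lambda_1 \mathfrak{w}_1 + \cdots + \lambda_n \mathfrak{w}_n + \lambda\,\mathfrak{u} = \mathfrak{o}$,

and $\lambda_1, \cdots, \lambda_n, \lambda$ are not all equal to zero. The vectors $\mathfrak{w}$
being linearly independent, we have necessarily $\lambda \ne 0$. Hence

$\mathfrak{u} = -\dfrac{\lambda_1}{\lambda}\,\mathfrak{w}_1 - \cdots - \dfrac{\lambda_n}{\lambda}\,\mathfrak{w}_n$.

The condition $k = n$ is also necessary for completeness, but
we shall not have to make use of it.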
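9. APPROXIMATION IN THE MEAN

Let us consider a normalized orthogonal system $\mathfrak{w}_1, \cdots, \mathfrak{w}_k$,
and an arbitrary vector $\mathfrak{u}$. We wish to determine the coefficients
$b_1, \cdots, b_k$ in such a way that

$\left\{\mathfrak{u} - \sum_{\nu=1}^{k} b_\nu \mathfrak{w}_\nu\right\}^2$

assumes a minimum. If there exists a suitable set of coefficients,
we say that the corresponding linear combination $b_1 \mathfrak{w}_1 + \cdots + b_k \mathfrak{w}_k$
gives a “best approximation in the mean” to the vector $\mathfrak{u}$.

The following transformations will at once clear up the situation:

$\left\{\mathfrak{u} - \sum_{\nu=1}^{k} b_\nu \mathfrak{w}_\nu\right\}^2 = \mathfrak{u}^2 - 2 \sum_{\nu=1}^{k} b_\nu\,(\mathfrak{u}\,\mathfrak{w}_\nu) + \sum_{\nu=1}^{k} b_\nu^2$

$= \mathfrak{u}^2 - \sum_{\nu=1}^{k} (\mathfrak{u}\,\mathfrak{w}_\nu)^2 + \sum_{\nu=1}^{k} (\mathfrak{u}\,\mathfrak{w}_\nu)^2 - 2 \sum_{\nu=1}^{k} b_\nu\,(\mathfrak{u}\,\mathfrak{w}_\nu) + \sum_{\nu=1}^{k} b_\nu^2$

$= \mathfrak{u}^2 - \sum_{\nu=1}^{k} (\mathfrak{u}\,\mathfrak{w}_\nu)^2 + \sum_{\nu=1}^{k} \{b_\nu - \mathfrak{u}\,\mathfrak{w}_\nu\}^2$,

and if we designate

$a_1 = \mathfrak{u}\,\mathfrak{w}_1,\ a_2 = \mathfrak{u}\,\mathfrak{w}_2,\ \cdots,\ a_k = \mathfrak{u}\,\mathfrak{w}_k$

we have the fundamental equation

(11)  $\left\{\mathfrak{u} - \sum_{\nu=1}^{k} b_\nu \mathfrak{w}_\nu\right\}^2 = \mathfrak{u}^2 - \sum_{\nu=1}^{k} a_\nu^2 + \sum_{\nu=1}^{k} (b_\nu - a_\nu)^2$.

On the right hand, the coefficients $b_1, \cdots, b_k$ are not met with
but in the last sum, and this sum assumes a minimum for $b_\nu = a_\nu$
only. By that, we have:

THEOREM 7. “Among all linear combinations of the normalized
orthogonal vectors $\mathfrak{w}_1, \cdots, \mathfrak{w}_k$ there is one and only one which
gives a best approximation in the mean to the vector $\mathfrak{u}$, and the
‘best coefficients’ are $\mathfrak{u}\,\mathfrak{w}_1, \mathfrak{u}\,\mathfrak{w}_2, \cdots, \mathfrak{u}\,\mathfrak{w}_k$.”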
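Equation (11) and theorem 7 can be verified on a small example. The sketch below takes $n = 4$ and the two vectors $(1,1,1,1)$ and $(1,-1,1,-1)$, which are normalized and orthogonal under the product with the factor $1/n$; the vector `u` and the trial coefficients are arbitrary illustrative numbers.

```python
def dot(u, v):
    """Product of two vectors as defined in No. 4: (1/n) * sum(u_i * v_i)."""
    return sum(a * b for a, b in zip(u, v)) / len(u)


def squared_error(u, ws, bs):
    """Square of the vector u - sum(b_v * w_v)."""
    residual = [ui - sum(b * w[i] for b, w in zip(bs, ws)) for i, ui in enumerate(u)]
    return dot(residual, residual)


ws = [[1, 1, 1, 1], [1, -1, 1, -1]]      # normalized orthogonal system
u = [3, 1, 4, 1]                          # arbitrary vector

best = [dot(u, w) for w in ws]            # the 'best coefficients' a_v
minimum = squared_error(u, ws, best)      # left side of (12)

trial = [2.0, 1.0]                        # any other coefficients b_v
lhs = squared_error(u, ws, trial)
rhs = dot(u, u) - sum(a * a for a in best) + sum((b - a) ** 2 for a, b in zip(best, trial))

if __name__ == "__main__":
    print(best, minimum, lhs, rhs)
```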


The equation (11) admits some important conclusions concerning
the coefficients $a_1, \cdots, a_k$. By putting

$b_1 = a_1,\ \cdots,\ b_k = a_k$

we derive

(12)  $\left\{\mathfrak{u} - \sum_{\nu=1}^{k} a_\nu \mathfrak{w}_\nu\right\}^2 = \mathfrak{u}^2 - \sum_{\nu=1}^{k} a_\nu^2$.

The left side evidently is not negative, hence

(13)  $a_1^2 + a_2^2 + \cdots + a_k^2 \le \mathfrak{u}^2$.

Finally, if $\mathfrak{w}_1, \cdots, \mathfrak{w}_n$ is a COMPLETE system of normalized
orthogonal vectors, the preceding reasonings of course hold for
every $k = 1, 2, \cdots, n$. But we can show more than (13), viz.

(14)  $a_1^2 + a_2^2 + \cdots + a_n^2 = \mathfrak{u}^2$.

Indeed, according to the definition of completeness we have
with suitable $b_1, \cdots, b_n$:

$\left\{\mathfrak{u} - \sum_{\nu=1}^{n} b_\nu \mathfrak{w}_\nu\right\}^2 = 0$

and a fortiori, by theorem 7,

$\left\{\mathfrak{u} - \sum_{\nu=1}^{n} a_\nu \mathfrak{w}_\nu\right\}^2 = 0$,

which is, regarding (12), equivalent to (14).
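10. THE TCHEBYCHEF COEFFICIENTS

Let

$\xi_1 < \xi_2 < \cdots < \xi_n$

be a set of best values corresponding to a given frequency function
$\varphi(x)$ (see No. 2). We form

(15)  $\mathfrak{e}_0 = (1, 1, \cdots, 1)$
      $\mathfrak{e}_1 = (\xi_1, \xi_2, \cdots, \xi_n)$
      $\mathfrak{e}_2 = (\xi_1^2, \xi_2^2, \cdots, \xi_n^2)$
      $\cdots$
      $\mathfrak{e}_{n-1} = (\xi_1^{\,n-1}, \xi_2^{\,n-1}, \cdots, \xi_n^{\,n-1})$.

The vectors $\mathfrak{e}_0, \cdots, \mathfrak{e}_{n-1}$ are linearly independent. For

$\lambda_0 \mathfrak{e}_0 + \lambda_1 \mathfrak{e}_1 + \cdots + \lambda_{n-1} \mathfrak{e}_{n-1} = \mathfrak{o}$

means

$\lambda_0 + \lambda_1 \xi_\nu + \lambda_2 \xi_\nu^2 + \cdots + \lambda_{n-1} \xi_\nu^{\,n-1} = 0$  $(\nu = 1, \cdots, n)$,

that is, the polynomial

$P(x) = \lambda_0 + \lambda_1 x + \cdots + \lambda_{n-1} x^{n-1}$

of degree $\le (n-1)$ possesses $n$ different zeros $\xi_1, \cdots, \xi_n$.
But the number of zeros of a polynomial cannot exceed its degree
unless all coefficients vanish. Hence $\lambda_0 = \lambda_1 = \cdots = \lambda_{n-1} = 0$.

Let us designate the (complete) set of normalized orthogonal
vectors corresponding to $\mathfrak{e}_0, \cdots, \mathfrak{e}_{n-1}$ by $\mathfrak{z}_0, \mathfrak{z}_1, \cdots, \mathfrak{z}_{n-1}$.

When we have to deal with a set of observations $x_1, \cdots, x_n$,
there will not be any practical loss of generality if we assume these
values arranged according to magnitude,

$x_1 \le x_2 \le \cdots \le x_n$,

and to be not all equal. Then we define the vector $\mathfrak{x}$ by

$\mathfrak{x} = (x_1, x_2, \cdots, x_n)$,

and we propose to call the coefficients

$a_0 = \mathfrak{x}\,\mathfrak{z}_0,\ a_1 = \mathfrak{x}\,\mathfrak{z}_1,\ \cdots,\ a_{n-1} = \mathfrak{x}\,\mathfrak{z}_{n-1}$

“Tchebychef Coefficients” of $\mathfrak{x}$.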
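The central position of the Tchebychef coefficients for analyzing
purposes is pointed out by the following theorems 8 and 9.

THEOREM 8. “The set $\mathfrak{z}_0, \mathfrak{z}_1, \cdots, \mathfrak{z}_{n-1}$ and a fortiori the
Tchebychef coefficients $a_0, a_1, \cdots, a_{n-1}$ of the observations
$x_1, x_2, \cdots, x_n$ do not depend on the special frequency function $\varphi(x)$,
but on the type only to which $\varphi(x)$ belongs.”

To prove this theorem, let us consider, besides $\varphi(x)$, an arbitrary
individual of its type,

$\tilde{\varphi}(x) = \dfrac{1}{\gamma}\,\varphi\!\left(\dfrac{x - \beta}{\gamma}\right)$.

The best values belonging to $\tilde{\varphi}(x)$ are (see No. 2)

$\tilde{\xi}_1 = \gamma\,\xi_1 + \beta,\ \cdots,\ \tilde{\xi}_n = \gamma\,\xi_n + \beta$,

and we deduce, if $\tilde{\mathfrak{e}}_0, \cdots, \tilde{\mathfrak{e}}_{n-1}$ designate the vectors (15) obtained
from $\tilde{\xi}$ instead of $\xi$:

$\tilde{\mathfrak{e}}_0 = \mathfrak{e}_0$
$\tilde{\mathfrak{e}}_1 = \beta\,\mathfrak{e}_0 + \gamma\,\mathfrak{e}_1$
$\tilde{\mathfrak{e}}_2 = \beta^2\,\mathfrak{e}_0 + 2\beta\gamma\,\mathfrak{e}_1 + \gamma^2\,\mathfrak{e}_2$
$\cdots$
$\tilde{\mathfrak{e}}_{n-1} = \beta^{n-1}\,\mathfrak{e}_0 + \binom{n-1}{1}\,\beta^{n-2}\gamma\,\mathfrak{e}_1 + \cdots + \gamma^{n-1}\,\mathfrak{e}_{n-1}$.

Hence, by an application of theorem 5, the normalized orthogonal
vectors $\tilde{\mathfrak{z}}_0, \cdots, \tilde{\mathfrak{z}}_{n-1}$ are identical with $\mathfrak{z}_0, \cdots, \mathfrak{z}_{n-1}$.

If we choose a new unit of measurement and a new origin,
that is to say if we perform a transformation

$x^* = \gamma\,x + \beta,\ \ \xi^* = \gamma\,\xi + \beta$  $(\gamma > 0)$,

the vectors $\mathfrak{z}_0, \cdots, \mathfrak{z}_{n-1}$ do not change (by the reasoning just
finished). The vector $\mathfrak{x}$ changes into

$\mathfrak{x}^* = \gamma\,\mathfrak{x} + \beta\,\mathfrak{e}_0$  $(\mathfrak{e}_0 = (1, 1, \cdots, 1))$,

and we have:

THEOREM 9. “If a new unit of measurement and a new origin
are introduced, say

$x^* = \gamma\,x + \beta$  $(\gamma > 0)$,

then the Tchebychef coefficients change into

$a_0^* = \gamma\,a_0 + \beta,\ a_1^* = \gamma\,a_1,\ a_2^* = \gamma\,a_2,\ \cdots,\ a_{n-1}^* = \gamma\,a_{n-1}$.”

11. MEAN AND DISPERSION. COEFFICIENTS
OF SKEWNESS AND KURTOSIS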
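Preparatory to the definition in this chapter, let us consider
$a_0$ and $a_1$ especially. To begin with, we have

$\mathfrak{z}_0 = \mathfrak{e}_0 = (1, 1, \cdots, 1)$

and therefore

$a_0 = \dfrac{1}{n}\,(x_1 + x_2 + \cdots + x_n)$.

The proof of theorem 4 furnishes a convenient way to compute
$\mathfrak{z}_1$. We put

$\mathfrak{v}_0 = \mathfrak{e}_0$,  $\mathfrak{v}_1 = \gamma\,\mathfrak{e}_0 + \mathfrak{e}_1$

and determine $\gamma$ so that $\mathfrak{v}_0\,\mathfrak{v}_1 = 0$:

$\gamma = -\mathfrak{e}_0\,\mathfrak{e}_1 = -\dfrac{1}{n}\,(\xi_1 + \xi_2 + \cdots + \xi_n)$.

With the designations

$m_1 = \dfrac{1}{n}\,(\xi_1 + \cdots + \xi_n)$,  $m_2 = \dfrac{1}{n}\,(\xi_1^2 + \cdots + \xi_n^2)$

we obtain

$\mathfrak{e}_0\,\mathfrak{e}_1 = m_1$,  $\mathfrak{e}_1^2 = m_2$,  $\mathfrak{v}_1^2 = m_2 - m_1^2$.

Hence

$\mathfrak{z}_1 = \dfrac{1}{\sqrt{m_2 - m_1^2}}\,(\xi_1 - m_1,\ \xi_2 - m_1,\ \cdots,\ \xi_n - m_1)$

and

$a_1 = \dfrac{\tfrac{1}{n}\,(\xi_1 x_1 + \cdots + \xi_n x_n) - m_1\,a_0}{\sqrt{m_2 - m_1^2}}$.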
Concerning $a_1$ we have now to deal with a theorem which is
of the greatest importance for our purposes.
THEOREM 10. “The Tchebychef coefficient $a_1$ is always positive:

$a_1 > 0$.”
For a proof we can proceed as follows: if we designate the 
components of 3, by a seas S , we have 


S.<45,.¢--- <5. a S,to7: #San te: 


From this we deduce the existence of a subscript Vv so that 


.. ae Se Log ha* cs . 


Let us put 


2-5, oo, it a ane 


Then we have 
4 J ees ‘ 
mee, ) 2, <o , <2. ¢* (4. 7 0, 
which gives 
<0, B46,---, £2 <e. 
On the other hand, the identity 
6. x Free «. * ~* (%-%)- 2, (x,-%,)- ole — (x-X,,) 
holds. The differences %-% ,++-,X,-%,_, are all ZO, and 
%,*°:, X,, being subjected to the condition not to be all equal, at 


least one difference really is positive. Hence 
LL. 
a, = be (Sx, +++ Sy Xn) > 9. 


There are no restrictions for the Tchebychef coefficients different
from $a_1$ as far as their signs are concerned.

The reader, after having verified the truth of the following 
statement, will now be prepared to accept the definition below. 


“If the vector G is of the form 


= 4, €,+G Et t, G2 >» 
the sign of @, coincides with that of 6, ; if it is of the form 
E-tb+66+6b+66, 
the sign of 2, coincides with that of  ; and so on.” 


DEFINITION. “A type of frequency function being given, the
Tchebychef coefficients $a_0, a_1, a_2, a_3$ of the observations
$x_1, x_2, \dots, x_n$ shall be called:

$a_0 = M$ = MEAN of the Observations
$a_1 = \sigma$ = DISPERSION of the Observations
$a_2$ = TCHEBYCHEF COEFFICIENT OF SKEWNESS of the Observations
$a_3$ = TCHEBYCHEF COEFFICIENT OF KURTOSIS of the Observations.”

We do not believe the Tchebychef coefficients with a higher 

subscript than 3 to be of any practical interest. 
12. MEASURES OF SKEWNESS AND KURTOSIS

No matter how the mean and the dispersion of a set of obser- 
vations are defined, the dispersion will always have to depend on 
the unit of measurement, and the mean furthermore on the origin. 
But the case is a different one concerning the concepts of skewness 
and kurtosis. Here it is reasonable to raise the question for meas- 
ures in the strict sense. It is obvious that such measures will be 
obtained if the set of observations is—by a convenient choice of a 
new unit—brought to the dispersion 1; the new Tchebychef coef- 
ficients of skewness and kurtosis will be suitable. This leads to the 

DEFINITION. “With the designations of the preceding chapter,
the ratios $a_2/a_1$ and $a_3/a_1$ shall be called:

$a_2/a_1 = S$ = MEASURE OF SKEWNESS of the Observations
$a_3/a_1 = K$ = MEASURE OF KURTOSIS of the Observations.”


There will be no misunderstanding if we use the words
“Skewness” and “Kurtosis” instead of “Measure of Skewness”
and “Measure of Kurtosis”. Utilizing theorems 8 and 9, we have
at once:

THEOREM 11. “The measures of skewness and kurtosis depend
on the type of frequency function and on the observations $\mathfrak{x}$
only; they are independent of origin and unit of measurement.”
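
Theorems 9 and 11 can be checked numerically. The following is only a sketch under stated assumptions: the step type is taken (so that the best values are equidistant), the product of two vectors is taken as the mean of the products of their components, the vectors $\mathfrak{z}_\kappa$ are obtained by orthonormalizing the powers of the best values, and the observations and the constants `alpha`, `beta` are hypothetical.

```python
from math import sqrt

def tchebychef_coefficients(x, xi, degree=3):
    """Coefficients a_0..a_degree of x relative to the orthonormalized powers of xi."""
    n = len(x)

    def dot(u, v):
        # product of vectors, taken here as the mean of the component products
        return sum(p * q for p, q in zip(u, v)) / n

    basis, coeffs = [], []
    for k in range(degree + 1):
        h = [t ** k for t in xi]
        for z in basis:                       # strip the parts along earlier vectors
            c = dot(h, z)
            h = [p - c * q for p, q in zip(h, z)]
        norm = sqrt(dot(h, h))
        z = [p / norm for p in h]
        basis.append(z)
        coeffs.append(dot(x, z))
    return coeffs

xi = [2 * v - 11 for v in range(1, 11)]       # equidistant best values, n = 10
x = [0.2, 0.9, 1.1, 1.8, 2.0, 2.1, 3.5, 4.0, 6.3, 9.1]   # hypothetical observations
alpha, beta = 2.5, -4.0                       # hypothetical new unit and origin
x_star = [alpha * t + beta for t in x]

a = tchebychef_coefficients(x, xi)
a_star = tchebychef_coefficients(x_star, xi)
skewness, skewness_star = a[2] / a[1], a_star[2] / a_star[1]
kurtosis, kurtosis_star = a[3] / a[1], a_star[3] / a_star[1]
```

Under these assumptions `a_star[0]` equals `alpha * a[0] + beta`, every higher coefficient is multiplied by `alpha`, and the two ratios are unchanged.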

13. MEANING OF SKEWNESS AND KURTOSIS
To secure an idea of the mechanism of skewness and kurtosis,
let us construct some examples which show these phenomena in
complete purity. We will use the step function, and we intend to
choose the values $x_1, \dots, x_n$ so that they are affected (apart from
the inevitable dispersion) in the first place with skewness only, in
the second place with kurtosis only.

We take $n = 10$, and for the convenience of the reader we
actually write down the vectors $\mathfrak{z}_0, \dots, \mathfrak{z}_3$. We observe however
that in practice one will never evaluate these vectors, but
rather compute the Tchebychef coefficients in the direct manner
described in No. 15.


We obtain

  ν    z_0      z_2          z_3
  1     1    + 1.65145    - 1.43388
  2     1    +  .55048    +  .47796
  3     1    -  .27524    + 1.19490
  4     1    -  .82572    + 1.05834
  5     1    - 1.10096    +  .40968
  6     1    - 1.10096    -  .40968
  7     1    -  .82572    - 1.05834
  8     1    -  .27524    - 1.19490
  9     1    +  .55048    -  .47796
 10     1    + 1.65145    + 1.43388


We shall have to come back to these vectors in No. 17. For 
this reason they have been calculated more accurately than is neces- 
sary here. 


1a. Positive skewness.

$\mathfrak{x} = \mathfrak{z}_1 + \tfrac{1}{5}\mathfrak{z}_2$  ($a_1 = 1$, $a_2 = +\tfrac{1}{5}$, $a_\nu = 0$ otherwise).

1b. Negative skewness.

$\mathfrak{x} = \mathfrak{z}_1 - \tfrac{1}{5}\mathfrak{z}_2$  ($a_1 = 1$, $a_2 = -\tfrac{1}{5}$, $a_\nu = 0$ otherwise).

2a. Positive kurtosis.

$\mathfrak{x} = \mathfrak{z}_1 + \tfrac{1}{10}\mathfrak{z}_3$  ($a_1 = 1$, $a_3 = +\tfrac{1}{10}$, $a_\nu = 0$ otherwise).

2b. Negative kurtosis.

$\mathfrak{x} = \mathfrak{z}_1 - \tfrac{1}{10}\mathfrak{z}_3$  ($a_1 = 1$, $a_3 = -\tfrac{1}{10}$, $a_\nu = 0$ otherwise).

The components of the different vectors $\mathfrak{x}$ are put together
in the table below:
To illustrate the preceding, we compare the vectors $\mathfrak{x}$ with
their corresponding “best systems of best values”, that is to say
with the vectors

$a_0\mathfrak{z}_0 + a_1\mathfrak{z}_1$,

and carry it through with some figures. We place the components
of $\mathfrak{x}$ on a horizontal straight line I, the components of
$a_0\mathfrak{z}_0 + a_1\mathfrak{z}_1$ on a second straight line II below:

[Figures. Panels: 1a. Positive Skewness; 2a. Positive Kurtosis; 2b. Negative Kurtosis. In each panel line I carries the components of $\mathfrak{x}$, line II those of the best system.]

The reader should settle his mind upon the fact that the general
behaviour of observations affected with skewness only or kurtosis
only is always the same, no matter which type of frequency
function is considered. The meaning of skewness and kurtosis
can be, generally speaking, expressed by:

Positive Skewness = Overconcentration to the Left 

Negative Skewness = Overconcentration to the Right 

Positive Kurtosis = Overconcentration near the Mean 

Negative Kurtosis = Underconcentration near the Mean. 
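
The vectors of this chapter can be recomputed with a few lines of code. This is a minimal sketch, not the author's procedure: it assumes that the best values of the step function are equidistant, that the product of vectors is the mean of the component products, and that the multipliers in cases 1a and 2a read 1/5 and 1/10.

```python
from math import sqrt

def orthonormal_vectors(xi, degree=3):
    """Orthonormalize the powers of the best values xi (product = mean of products)."""
    n = len(xi)

    def dot(u, v):
        return sum(p * q for p, q in zip(u, v)) / n

    basis = []
    for k in range(degree + 1):
        h = [t ** k for t in xi]
        for z in basis:                       # strip the parts along earlier vectors
            c = dot(h, z)
            h = [p - c * q for p, q in zip(h, z)]
        norm = sqrt(dot(h, h))
        basis.append([p / norm for p in h])
    return basis

n = 10
xi = [2 * v - (n + 1) for v in range(1, n + 1)]   # equidistant best values -9, -7, ..., 9
z0, z1, z2, z3 = orthonormal_vectors(xi)

skew_pos = [p + q / 5 for p, q in zip(z1, z2)]    # case 1a (assumed multiplier 1/5)
kurt_pos = [p + q / 10 for p, q in zip(z1, z3)]   # case 2a (assumed multiplier 1/10)

# The mean a_0 is zero in both cases, so "left of the mean" means "negative".
below_mean = sum(1 for v in skew_pos if v < 0)
```

With positive skewness more than half of the ten observations fall to the left of the mean, and with positive kurtosis the central observations lie closer to the mean than the corresponding components of $\mathfrak{z}_1$.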

14. MEASURES OF APPROXIMATION

Let $T$ be a type of frequency function, $\mathfrak{x} = (x_1, \dots, x_n)$ a set
of observations, and $\kappa \geq 1$ a “degree of approximation”, that is
the subscript in the sum $a_0\mathfrak{z}_0 + \dots + a_\kappa\mathfrak{z}_\kappa$. The expression

$|\mathfrak{x} - (a_0\mathfrak{z}_0 + \dots + a_\kappa\mathfrak{z}_\kappa)|$

will give us a clue to the quality of approximation to the vector $\mathfrak{x}$
which is obtained on the basis of the type $T$ and the degree of
approximation $\kappa$. But the expression above of course is not yet
fit to be taken as a measure of the quality of approximation.
Therefore it will be necessary in the first instance to modify it so that it
will become not only independent of the origin, but also independent
of the unit of measurement.

Regarding theorem 9, and making reflections customary in
situations of this kind, we are almost compulsorily led to the

DEFINITION. “The values

$M_\kappa = \sqrt{\dfrac{a_1^2 + \dots + a_\kappa^2}{\mathfrak{x}^2 - a_0^2}} \quad (\kappa = 1, 2, \dots, n-1)$

shall be called MEASURES OF APPROXIMATION OF THE DEGREES $\kappa$.”

THEOREM 12. “The measures of approximation $M_\kappa$ depend
on the type $T$ and on the observations $\mathfrak{x}$ only; they are independent
of origin and unit of measurement. Furthermore they satisfy

$0 \leq M_1 \leq M_2 \leq \dots \leq M_{n-1} = 1$.”

All is clear if we write, utilizing the relation (14) in No. 9,
$M_\kappa$ in the form

$M_\kappa^2 = \dfrac{a_1^2 + a_2^2 + \dots + a_\kappa^2}{a_1^2 + a_2^2 + \dots + a_{n-1}^2}$,

paying attention to theorems 6, 8 and 9.

If $M_\kappa$ is not much smaller than 1, the approximation of degree
$\kappa$ will be estimated to be good. If $T$ and $T^*$ are two types
of frequency function, $M_\kappa$ and $M_\kappa^*$ the corresponding measures
of approximation, and if $M_\kappa^* \geq M_\kappa$, we say: $T^*$ is, for the degree
$\kappa$, better than $T$ (equivalence not excluded). If $M_1^* \geq M_1, \dots,
M_\kappa^* \geq M_\kappa$, we say: $T^*$ is, up to the degree $\kappa$, better than $T$.

Clearly we may base upon these concepts a method of curve- 
fitting. A full account will be given in a future note. 
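
As an illustration only, the measures of approximation can be evaluated from a list of Tchebychef coefficients and the squared length $\mathfrak{x}^2$ of the observation vector, in the form used in the finishing computations of No. 19 (denominator $\mathfrak{x}^2 - a_0^2$). The function name and the numbers below are hypothetical.

```python
from math import sqrt

def measures_of_approximation(a, x_sq):
    """Return [M_1, ..., M_k] for coefficients a = [a_0, a_1, ..., a_k].

    x_sq is the squared length of the observation vector, so that
    x_sq - a_0**2 is the sum of the squares of all coefficients from a_1 on.
    """
    total = x_sq - a[0] ** 2
    measures, partial = [], 0.0
    for coeff in a[1:]:
        partial += coeff ** 2
        measures.append(sqrt(partial / total))
    return measures

# Hypothetical coefficients a_0..a_3 and squared length of the observations.
M = measures_of_approximation([0.5, 2.0, 0.3, -0.1], 4.40)
```

The values never decrease with the degree and cannot exceed 1, in agreement with theorem 12.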

15. COMPUTATION OF THE TCHEBYCHEF COEFFICIENTS

If the vectors $\mathfrak{z}_0, \dots, \mathfrak{z}_\kappa$ are already known, the finding of
$a_0, \dots, a_\kappa$ is, according to their definition, very simple. But the
actual calculation of $\mathfrak{z}_0, \dots, \mathfrak{z}_\kappa$ is embarrassing, especially if $n$ is
large. We already mentioned that this can be and should be avoided,
and we recommend the following procedure.

We form, just as in the proof of theorem 4, 

= Ee 
2.26" 
oe? +: ’ ™ t “by ) 

and to Zinn ‘the coefficients a we > demand that the vectors 9 
be orthogonal. Let % be an arbitrary subscript among 4),---,K . 
Then at least it must be true that 


(16) hpr*--- + Regare 
and a fortiori 
By (KM, ie * Ss Jun) = 


for sien am Gq, . But & 4 oe, = are linear 


9G, 
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combinations of 4, . Ay , hence 
0) TD x1 
0 0 


(17) 


ae, 


=O, 


y Fu t.., 


= ©. 





For abbreviation let us designate the moments of the best 
values bas" e. by 


nh nr a 
™,* ‘ (6, . é.t+&) 


Obviously we have 


é,€ 


and the equation (17) produces 


, Me oy Mur ee 


(18) a 


must have exactly one solution 7 


Ko 


= Mogg (7 G42 0 
TPL. rr 
17 Ne yt 
™, a ee mM, Ne, x-t + Mt 4, 
mL 


es Le* a ‘, ree 


Conversely, from (18) follows (16), hence the equations (18) 


2X-2 


Concerning the normalizing factor in 


2 


a, 


s 
ra 


=(™m, 


and from (18): *° 


2 


Ay = My ees oe 


oe ae ae 


lue* oo Vic, x-1 lls 
+ ( ™, .** 


+ ( > 


1 4 
‘? §. 


hss 1S Se 


2X-/ 


With the abbreviations 


KEE, Ke Ge 


we have 


& 


/ 


Xx 


o 


A, 


ae 
a Ix 


xX 


c 


(yx + %) 


(7X 


+ 


$ 
' 


wL — 
2X-1 Mx. xs t 


o 


x,X-1 


(2-0, 1.++- ). 


ve” * bie 

a h 

i x , we have 
x 


an 


+t 7 =O 
Vx, eI sat 


CRI EMRE 


2 
x, 1 Ox r x) 


+ Nee 


Mte,,) 1 t 


wt, ) 


2x 


+ 27, 


> 


x * 
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For the calculation of @,@,,--- we recommend operating 
according to the following recipe, in the demonstration of which 
we confine ourselves to the most important case K= 3. The modus 
procedendi for other values of the degree « will be clear. 

1. Compute 


5 e wen sas e 27-! 
€.= Va), S,°PG)---, $= 9(38). 
In the interest of, the accuracy of the results it is advisable to 
take care that the equations 


BLE er )eo, RCE &) = 


are precisely or approximately satisfied. This will be the case if 
4,29 My?! hold; otherwise introduce 


re +f +8 tee) fer é +3 


instead of ‘..; “ £. , with convenient constants *r) @ ‘ fh. 
2. Compute 2 
To » 12,,°**, 7%, 3 ho, Ms Ee 
Again it is useful to take care that the equations 
: devas, . is 
EK thy )eo; WON +h det 
are precisely or approximately satisfied. This will be secured if 


%,'**°>) X%, are distributed over nearly the same interval as 


ee ***s in* 


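3. Form the scheme

$$\begin{matrix} a_{00} & a_{01} & a_{02} & a_{03} \\ a_{10} & a_{11} & a_{12} & a_{13} \\ a_{20} & a_{21} & a_{22} & a_{23} \\ a_{30} & a_{31} & a_{32} & a_{33} \end{matrix} \;=\; \begin{matrix} m_0 & m_1 & m_2 & m_3 \\ m_1 & m_2 & m_3 & m_4 \\ m_2 & m_3 & m_4 & m_5 \\ m_3 & m_4 & m_5 & m_6 \end{matrix}$$

4a. To every element of the second, third and fourth row in
this scheme add the corresponding element of the first row multiplied
by $-\frac{a_{10}}{a_{00}}$, $-\frac{a_{20}}{a_{00}}$ and $-\frac{a_{30}}{a_{00}}$ respectively, so that there results
a scheme

$$\begin{matrix} a_{00} & a_{01} & a_{02} & a_{03} \\ 0 & a'_{11} & a'_{12} & a'_{13} \\ 0 & a'_{21} & a'_{22} & a'_{23} \\ 0 & a'_{31} & a'_{32} & a'_{33} \end{matrix}$$

4b. To every element of the second and third row in the
scheme

$$\begin{matrix} a'_{11} & a'_{12} & a'_{13} \\ a'_{21} & a'_{22} & a'_{23} \\ a'_{31} & a'_{32} & a'_{33} \end{matrix}$$

add the corresponding element of the first row multiplied by
$-\frac{a'_{21}}{a'_{11}}$ and $-\frac{a'_{31}}{a'_{11}}$ respectively, so that there results a scheme

$$\begin{matrix} a'_{11} & a'_{12} & a'_{13} \\ 0 & a''_{22} & a''_{23} \\ 0 & a''_{32} & a''_{33} \end{matrix}$$

4c. To every element of the second row of the scheme

$$\begin{matrix} a''_{22} & a''_{23} \\ a''_{32} & a''_{33} \end{matrix}$$

add the corresponding element of the first row multiplied by $-\frac{a''_{32}}{a''_{22}}$,
so that there results a scheme

$$\begin{matrix} a''_{22} & a''_{23} \\ 0 & a'''_{33} \end{matrix}$$

5. Extract

$\lambda_0 = \sqrt{a_{00}},\quad \lambda_1 = \sqrt{a'_{11}},\quad \lambda_2 = \sqrt{a''_{22}},\quad \lambda_3 = \sqrt{a'''_{33}}$.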
6. Multiply the elements of the first, second and third row in 


the scheme 
(" a,, ao Qo3 


‘ , ¢ 
o a,, a,, Chis 
oO o A, Qs; 
by .t-, —_~ » and —s- respectively, so that there results 
~ Aoo a,, Az? : 
a scheme 
! t- £. 


B 


and extract


7. To every clement of the first row in the scheme B add the 


corresponding element of the second row multiplied by -f, , £0 


FAT TT 


A TR RL OPPS 3 












5 Ae RT RENEE ee 
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that there results a scheme 


ory 
and extract 


Rte. i °-% 


8. To every element of the first and second row in the scheme 
BR add the corresponding element of the third row multiplied by 


, . 
-,, and - A respectively, so that there results a scheme 
”“ 
i 1 0 6 &, 


B = Oo / o &.. 


oo t y,/ 


and extract 


” t 
te * Aes i= -%s ; 
9. Form 


x 
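
The recipe can be imitated in code. The sketch below is not the tabular layout of the text: it assumes that the scheme of step 3 is extended on the right by a unit matrix, so that the row operations of steps 4a to 4c leave the coefficients $\gamma$ in the extension directly (which takes the place of steps 6 to 8), and it assumes $a_\kappa = (\gamma_{\kappa 0}X_0 + \dots + X_\kappa)/\lambda_\kappa$. The function names are hypothetical; the moments are the rounded values printed for the first example of No. 17.

```python
from math import sqrt

def reduce_scheme(m, degree=3):
    """Return (lam, gamma) from the moments m[0..2*degree] of the best values."""
    size = degree + 1
    # Scheme of step 3 (a_ij = m_{i+j}), extended on the right by a unit matrix.
    rows = [[m[i + j] for j in range(size)] +
            [1.0 if i == j else 0.0 for j in range(size)]
            for i in range(size)]
    for p in range(size):                     # row operations of steps 4a, 4b, 4c
        for r in range(p + 1, size):
            factor = rows[r][p] / rows[p][p]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[p])]
    lam = [sqrt(rows[p][p]) for p in range(size)]   # step 5
    gamma = [row[size:] for row in rows]            # gamma[k][j], with gamma[k][k] = 1
    return lam, gamma

def coefficients_from_moments(m, X, degree=3):
    """Tchebychef coefficients a_0..a_degree from the moments m and the sums X."""
    lam, gamma = reduce_scheme(m, degree)
    return [sum(g * v for g, v in zip(gamma[k], X)) / lam[k]
            for k in range(degree + 1)]

m = [1.0, 0.0, 0.87979, 0.0, 1.74062, 0.0, 4.22829]   # m_0 .. m_6
X = [-0.03000, 0.87237, 0.07317, 1.73577]             # X_0 .. X_3
lam, gamma = reduce_scheme(m)
a = coefficients_from_moments(m, X)
```

The elimination is carried out without interchanging rows, which presupposes that none of the quantities $a_{00}, a'_{11}, a''_{22}$ vanishes.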
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16. CONTROLS OF COMPUTATION 
It is easy to point out controls for the process of evaluation of
$a_0, a_1, a_2, a_3$, which do not require any considerable extra
work, and yet indicate every occurring miscalculation with almost
absolute safety. Such general controls, of course, cannot bear
upon the ascertainment of $\xi_1, \dots, \xi_n$.

A. Control of $m_0, m_1, m_2, m_3$:

$m_0 + 3(m_1 + m_2) + m_3 = \frac{1}{n}\sum_\nu (1 + \xi_\nu)^3$.

B. Control of $m_3, m_4, m_5, m_6$:

$m_3 + 3(m_4 + m_5) + m_6 = \frac{1}{n}\sum_\nu \xi_\nu^3 (1 + \xi_\nu)^3$.

C. Control of $X_0, X_1, X_2, X_3$:

$X_0 + 3(X_1 + X_2) + X_3 = \frac{1}{n}\sum_\nu (1 + \xi_\nu)^3 x_\nu$.

D. Control of $\mathfrak{x}^2$:

$1 + 2X_0 + \mathfrak{x}^2 = \frac{1}{n}\sum_\nu (1 + x_\nu)^2$.
E. Control of $\gamma_{10}, \gamma_{20}, \gamma_{21}, \gamma_{30}, \gamma_{31}, \gamma_{32}$.

The operations indicated under 3 - 8 in No. 15 are essentially
nothing else than the solution of three systems of linear equations
for one, two and three unknowns respectively, contracted into one
uniform process of reckoning. Hence we can make use of the
method of control by sums. We have to add the sums

$s_0 = a_{00} + \dots + a_{03}$
$\dots$
$s_3 = a_{30} + \dots + a_{33}$

as elements of a fifth column:

$$\begin{matrix} a_{00} & \dots & a_{03} & s_0 \\ \vdots & & \vdots & \vdots \\ a_{30} & \dots & a_{33} & s_3 \end{matrix}$$
and to transform this expanded scheme in the way described in 
No. 15. Then everywhere the sum of the first four elements of 
each row must equal its fifth element. If this is true for the scheme 
' . . . 
B' especially, it is practically impossible that o> Be should 
have been wrongly computed. 
F. Control of $a_0, a_1, a_2, a_3$.

The computation of $a_0, \dots, a_3$ should be performed by
starting from the scheme

$$\begin{matrix} 1 & \gamma_{10} & \gamma_{20} & \gamma_{30} & S_0 \\ 0 & 1 & \gamma_{21} & \gamma_{31} & S_1 \\ 0 & 0 & 1 & \gamma_{32} & S_2 \\ 0 & 0 & 0 & 1 & S_3 \end{matrix}$$

with the meaning

$1 + \gamma_{10} + \gamma_{20} + \gamma_{30} = S_0$,
$1 + \gamma_{21} + \gamma_{31} = S_1$,
$1 + \gamma_{32} = S_2$,
$1 = S_3$.

Multiply the elements of the first, second, third and fourth
row by $X_0$, $X_1$, $X_2$ and $X_3$ respectively, and form the sums of
the elements of each column. The sums of the first four columns
are $\lambda_0 a_0, \lambda_1 a_1, \lambda_2 a_2, \lambda_3 a_3$, and we have the control
$\lambda_0 a_0 + \lambda_1 a_1 + \lambda_2 a_2 + \lambda_3 a_3 = R$,
where $R$ designates the sum of the fifth column.
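
Controls A to D rest on the expansions of $(1+\xi)^3$ and $(1+x)^2$, so they can be tried on any data. The following sketch assumes the ungrouped case (every observation carries the weight $1/n$); the best values and observations are hypothetical, and the helper name is not from the text.

```python
def controls(x, xi):
    """Return, for each control A-D, the pair (left side, right side)."""
    n = len(x)
    m = [sum(t ** k for t in xi) / n for k in range(7)]                  # m_0 .. m_6
    X = [sum(t ** k * v for t, v in zip(xi, x)) / n for k in range(4)]   # X_0 .. X_3
    x_sq = sum(v * v for v in x) / n                                     # squared length of x
    return {
        "A": (m[0] + 3 * (m[1] + m[2]) + m[3],
              sum((1 + t) ** 3 for t in xi) / n),
        "B": (m[3] + 3 * (m[4] + m[5]) + m[6],
              sum(t ** 3 * (1 + t) ** 3 for t in xi) / n),
        "C": (X[0] + 3 * (X[1] + X[2]) + X[3],
              sum((1 + t) ** 3 * v for t, v in zip(xi, x)) / n),
        "D": (1 + 2 * X[0] + x_sq,
              sum((1 + v) ** 2 for v in x) / n),
    }

xi = [-1.2, -0.5, 0.1, 0.4, 1.3]      # hypothetical best values
x = [-2.0, -0.7, 0.3, 0.9, 2.4]       # hypothetical observations
result = controls(x, xi)
```

A disagreement between the two members of any pair points to a slip in the corresponding moments or sums.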

17. EXAMPLES

I. Let the observations ($n = 10$)

$\mathfrak{x} = (-1.5,\ -1.0,\ -.7,\ -.5,\ -.3,\ 0,\ +.3,\ +.6,\ +1.0,\ +1.8)$

be given, and let us first assume the normal type. The normal law
of error being symmetric, we have $m_1 = m_3 = m_5 = 0$, and in this case
we are able to write down the Tchebychef coefficients required:

$a_0 = X_0,\quad a_1 = \dfrac{X_1}{\sqrt{m_2}},\quad a_2 = \dfrac{X_2 - m_2 X_0}{\sqrt{m_4 - m_2^2}},\quad a_3 = \dfrac{m_2 X_3 - m_4 X_1}{\sqrt{m_2 (m_2 m_6 - m_4^2)}}$.

Nevertheless we will proceed according to No. 15. But we will
confine ourselves to give the resulting data of the different steps
of computation only. A full reproduction of the complete process
of reckoning is to be found in No. 19, dealing with a somewhat
more general situation.

In the KELLEY-WOOD tables we find

$\xi_1 = -1.644854$   $\xi_2 = -1.036433$   $\xi_3 = -.674490$   $\xi_4 = -.385320$
$\xi_5 = -.125661$   $\xi_6 = +.125661$   $\xi_7 = +.385320$   $\xi_8 = +.674490$
$\xi_9 = +1.036433$   $\xi_{10} = +1.644854$

We obtain

$m_0 = 1$   $m_1 = 0$   $m_2 = +.87979$   $m_3 = 0$
$m_4 = +1.74062$   $m_5 = 0$   $m_6 = +4.22829$
$X_0 = -.03000$   $X_1 = +.87237$   $X_2 = +.07317$   $X_3 = +1.73577$
$\mathfrak{x}^2 = .87700$;

$$\begin{matrix} a_{00} & a_{01} & a_{02} & a_{03} \\ 0 & a'_{11} & a'_{12} & a'_{13} \\ 0 & 0 & a''_{22} & a''_{23} \\ 0 & 0 & 0 & a'''_{33} \end{matrix} \;=\; \begin{matrix} 1 & 0 & +.87979 & 0 \\ 0 & +.87979 & 0 & +1.74062 \\ 0 & 0 & +.96659 & 0 \\ 0 & 0 & 0 & +.78456 \end{matrix}$$

$\lambda_0 = 1$   $\lambda_1 = .93797$   $\lambda_2 = .98315$   $\lambda_3 = .88575$

$$B = B' = \begin{matrix} 1 & 0 & +.87979 & 0 \\ 0 & 1 & 0 & +1.97845 \\ 0 & 0 & 1 & 0 \end{matrix}$$

$\gamma_{20} = -.87979$   $\gamma_{21} = 0$

$$B'' = \begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & +1.97845 \\ 0 & 0 & 1 & 0 \end{matrix}$$

$\gamma_{30} = 0$   $\gamma_{31} = -1.97845$   $\gamma_{32} = 0$

$M = a_0 = -.03000$   $\sigma = a_1 = +.93006$   $a_2 = +.10127$   $a_3 = +.01110$
$S = +.10889$   $K = +.01193$
M =.99382 Ma= .99953 "= .99959 


For comparison we give the value which is furnished by the
traditional concept of dispersion:

$\sqrt{\frac{1}{n}\sum_\nu (x_\nu - M)^2} = \sqrt{\mathfrak{x}^2 - a_0^2} = .93600$


II. Let the same observations as above be given, but now let
us assume the step type. We can make use of the vectors in No. 13,
which give at once

$M = a_0 = -.03000$   $\sigma = a_1 = +.92087$   $a_2 = +.10184$   $a_3 = +.12529$

$S = +.11059$   $K = +.13606$

M,=.98383 (= .98989 M,= .9994! 

We note that for our observations € the normal type is, up 

to the degree 3, better than the step type. 
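
Both examples can be recomputed directly from the observations. This is a sketch only, under the following assumptions: the ten observations are read as $-1.5, -1.0, -.7, -.5, -.3, 0, +.3, +.6, +1.0, +1.8$ (the reading that agrees with the printed $X_0$ and $\mathfrak{x}^2$); the best values are the quantiles of the specimen at $(2\nu - 1)/2n$; the product of vectors is the mean of the component products; and the vectors $\mathfrak{z}_\kappa$ are formed by orthonormalizing the powers of the best values instead of following the tabular scheme of No. 15.

```python
from math import sqrt
from statistics import NormalDist

def tchebychef_coefficients(x, xi, degree=3):
    """Coefficients a_0..a_degree of x for the best values xi."""
    n = len(x)

    def dot(u, v):
        return sum(p * q for p, q in zip(u, v)) / n

    basis, coeffs = [], []
    for k in range(degree + 1):
        h = [t ** k for t in xi]
        for z in basis:                       # strip the parts along earlier vectors
            c = dot(h, z)
            h = [p - c * q for p, q in zip(h, z)]
        norm = sqrt(dot(h, h))
        z = [p / norm for p in h]
        basis.append(z)
        coeffs.append(dot(x, z))
    return coeffs

x = [-1.5, -1.0, -0.7, -0.5, -0.3, 0.0, 0.3, 0.6, 1.0, 1.8]
n = len(x)
levels = [(2 * v - 1) / (2 * n) for v in range(1, n + 1)]

normal_xi = [NormalDist().inv_cdf(p) for p in levels]   # normal type
step_xi = levels                                        # step type: equidistant values

a_normal = tchebychef_coefficients(x, normal_xi)
a_step = tchebychef_coefficients(x, step_xi)
S_normal, K_normal = a_normal[2] / a_normal[1], a_normal[3] / a_normal[1]
S_step, K_step = a_step[2] / a_step[1], a_step[3] / a_step[1]
```

By theorem 8 any equidistant set may serve as best values of the step type, which is why `levels` itself is used for `step_xi`.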
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18. ANALYSIS OF FREQUENCY GrRouUPsS 
In economic statistics, observations very often do not appear
in the form dealt with in the preceding chapters. Instead, they
usually are gathered into groups, so that there is given a set of
values $x_1 < x_2 < \dots < x_n$ and a set of corresponding positive values
$N_1, \dots, N_n$, not necessarily integers. If $N$ means the sum
$N_1 + \dots + N_n$, the ratios

$f_1 = \dfrac{N_1}{N},\quad f_2 = \dfrac{N_2}{N},\ \dots,\ f_n = \dfrac{N_n}{N}$

are called the “frequencies” of the “observations” $x_1, \dots, x_n$.
The frequencies satisfy

$f_1 > 0,\ f_2 > 0,\ \dots,\ f_n > 0$ and $f_1 + \dots + f_n = 1$.

We shall now have to extend our developments to make them 
applicable in situations as stated above. To anyone who is familiar 
with integrals and sums in the sense of Stieltjes, it is clear that no’ 
special difficulty can arise. 

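Again we have to start from a frequency function $\varphi(x)$, and
to agree which values $\xi_1, \dots, \xi_n$ should be designated as “best
values”. Reflections similar to those of No. 2 make it reasonable
to choose

$\xi_1 = \psi\left(\tfrac{1}{2} f_1\right),\quad \xi_2 = \psi\left(f_1 + \tfrac{1}{2} f_2\right),\quad \xi_3 = \psi\left(f_1 + f_2 + \tfrac{1}{2} f_3\right),\ \dots,$
$\xi_n = \psi\left(f_1 + f_2 + \dots + f_{n-1} + \tfrac{1}{2} f_n\right)$.

Apart from the best values, we only have to modify the definition
of the product of vectors (No. 4). We define

$\mathfrak{u}\mathfrak{v} = u_1 v_1 f_1 + u_2 v_2 f_2 + \dots + u_n v_n f_n$.

If these modifications are kept in mind, all the definitions, theorems,
proofs and remarks of Nos. 4 - 16 remain unaltered. Of course,
the abbreviations $m_\kappa$ and $X_\kappa$ (No. 15) must now be read

$m_\kappa = \xi_1^\kappa f_1 + \xi_2^\kappa f_2 + \dots + \xi_n^\kappa f_n \quad (\kappa = 0, 1, 2, \dots)$
$X_\kappa = \xi_1^\kappa x_1 f_1 + \xi_2^\kappa x_2 f_2 + \dots + \xi_n^\kappa x_n f_n \quad (\kappa = 0, 1, 2, \dots)$,

and the controls A-D (No. 16)

A. $m_0 + 3(m_1 + m_2) + m_3 = \sum_\nu (1 + \xi_\nu)^3 f_\nu$
B. $m_3 + 3(m_4 + m_5) + m_6 = \sum_\nu \xi_\nu^3 (1 + \xi_\nu)^3 f_\nu$
C. $X_0 + 3(X_1 + X_2) + X_3 = \sum_\nu (1 + \xi_\nu)^3 x_\nu f_\nu$
D. $1 + 2X_0 + \mathfrak{x}^2 = \sum_\nu (1 + x_\nu)^2 f_\nu$.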
We are now in the position to cian the mechanism of 
skewness and kurtosis still more impressively than in No. 13. For 
this purpose we start from the frequency curve represented in Fig. 
3; we choose 7=9% and 


l, 3, 


-.8 -.6 -.4 -.2 0O +.2 +.4 +.6 +e 
Then the best values become equidistant, and they are given by the 
abscissae of the points marked by small circles: 
-8, -6,-4, -2, 0, +2, +4, +6, +38. 


The ordinates 7, of these points are proportional to W/, , namely: 


N, 
5 oe. 


The table below oe the corresponding vectors 3,,°---,}3 » 
: 7e + + x - 
and also the vectors 4 2 and 3* FFs as exam 


ples of distr distributions which show skewness ie show skewness ¢ or partons is in all — 


- 2.561 
+ .116 | 
+ 1.048 
+ .815 | 
.000 
- .815 | 
- 1.048 | 
- .116 | 
+ 2.561 


ef et 
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As in No. 13, let us illustrate the relations between the vector 
jy and the vectors 3,* 4 3, and 3,t4 3; by means of some fig- 


ures. This time however, we shall not only consider the components 
of the vectors, but also operate with the values {/ . We do that by 
associating every vector (u,,---,u,,) with the system of points 


(u, §,), (u,,f), ee ae (m,,, #,). 


Thus, in the figures 4a - 5b, the vector 3, is every time associated 


with the system of points marked by crosses, whereas the system 
of points marked by circles successively correspond to the vectors 


4At45, and 3,25 5 





The statements in No. 13 concerning the meaning of skewness 
as overconcentration to the left or to the right, and of kurtosis as 
overconcentration or underconcentration near the mean should be 
recognized. 

Until now, the values $N_\nu$ were supposed to be really positive,
but there is no difficulty in allowing some of them to equal zero.
Then, it is true, the formulation of some intermediary theorems
must be changed. Yet, the existence and the main properties of the
Tchebychef coefficients remain untouched, and their values are
independent of those $x_\nu$ for which the corresponding $f_\nu$ are equal
to zero. To know this is sometimes useful in order to get a scheme
of computation of the highest possible uniformity.
19. EXAMPLE 

To conclude, we reproduce the reckoning of an example, frequently
discussed, concerning observations of the right ascension
of the pole star (see: A. L. BOWLEY, Elements of Statistics, 4th
ed., p. 255). The given data are

$x_\nu^*$:  -7  -6  -5  -4  -3  -2  -1   0  +1  +2  +3  +4  +5  +6
$N_\nu$:     1   6  12  21  36  61  73  82  72  63  38  16   5   1

and the normal type shall be assumed.

Because the function

$\varphi(x) = \dfrac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$

satisfies $\mu_1 = 0$, $\mu_2 = 1$, it will be suitable to start from the best
values of this specimen. These best values $\xi_1, \dots, \xi_n$ stretch from
-3 to +3 approximately. In order to have the values $x_\nu$, with
which we intend to work, in coextension with $\xi_1, \dots, \xi_n$, we
choose

$x_\nu = \tfrac{1}{2} x_\nu^*$, i.e. $x_\nu^* = 2 x_\nu \quad (\nu = 1, \dots, 14)$.

Between the means $M$, $M^*$ and the dispersions $\sigma$, $\sigma^*$ of the
observations $x_\nu$, $x_\nu^*$ there exist the connections (theorem 9)

$M^* = 2M,\quad \sigma^* = 2\sigma$,

whereas the measures of skewness and kurtosis as well as the
measures of approximation do not change at the transition from
$x_\nu$ to $x_\nu^*$ (theorems 11 and 12).
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.00205 339 

.01232 033 j 

02464 066 |-. .09195 232 

.04312 11 ; .10359 38 

.07392 20 .08719 22 10284 46 .12130 67 

12525 7 .09727 9 .07555 1 .05867 6 

14989 7 .05523 2 .02035 1 .00749 9 

.16837 8 + .00650 2 .00025 1 .00001 0 

14784 4 + .06647 5 .02988 9 .01343 9 

12936 34 |+ .11457 94 .10148 49 .08988 69 

.07802 87 |+.10747 95 .14804 60 .20392 37 

.03285 421 |+ .06240 756 | .11854 503 .22517 98 

‘01026 694 |t .02513 127 | .06151 597 .15057 79 
t 


00205 339 00632 941 -01950 990 -06013 77 


1.76125 5 

.97956 7 2.35026 3 
34314 13 .66287 0 1.28051 2 
24887 3 38574 5 59789 2 
14308 3 .16876 9 19906 6 
04597 0 .03539 1 02748 6 
00276 3 .00101 8 00037 5 
.00000 0 .00000 0 .00000 0 
00604 3 .00271 7 .00122 2 
.07961 4 07051 5 .06245 6 
28089 2 .38691 0 53294 3 
.42773 58 81249 7 1.54336 2 
36858 25 90221 1 2.20841 9 
18536 96 57138 7 1.76125 5 


+ 2.72531=m, | — .05851=7, |+ 12.32651=™, 


+ arrre + 1 
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x] x6 _[&xt lex lees | oe 


“0215 5 | .02515 4 
.08868 006] - . | .11088 3 
.11900 014] - . |  .15400 4 
13367 26 | - . | 17248 5 
.13078 83 | - . | 16632 4 
12525 67 | .09727 9 } 12525 7 
- 074949 | .02761 6 
.00000 0 | .00000 0 
.07392 2 | .03323 8 
11457 94 
16121 93 
12481 512 
.06282 818 
.01898 820 


' 


CE Er 
. ' 

bane ee 

bn seer e 


.04852 9} + 2.74027 2|+ 1.57279 4 

.80089 2)+ 5.48924 0}+1.60178 3 ‘ 

.42078 7} t 6.17137 8]+1.05196 7 .12577 0 
.13970 9} ¢ 4.09166 2|+ .41912 6 .03285 4 


+ 3.87570 14 20.31408 2.17710 


m,+3(mrm,)+m, = 3.87569 
™. 3( My tM ) + me. = 20.31408 
X.+ 3(X%,+%)+ Xz . = 5.94901 __ 
1+2X%,+ G = 2.17710 
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Computation of 4,,,°°° 


and of 


1 


Rae 
00113 


+ 2.72531 
- .00001 
+ 2.72530 


+ 2.72530 


? 


‘ 


a a 


ea Wns 


+ .96397 


+ 2.72531 
- .92924 
+ 1.79607 


+ 1.79593 


A, = .98182 
A, = 1.34012 
A, =2.14974 


+ .01237 
- .04614 


+12.32651 
- 00016 
+12.32635 


+ 2.72530 


- .04614 
+ .03319 
— .01295 


+12.32635 
- 7.70486 
+ 4.62149 


+ 4.62149 
- .00009 
+ 4.62140 


“ a, (twice underlined ) 


-;A, , with control by sums. 


+ 3.67532 
+ .00220 
+ 3.67752 


+ 3.61794 
- 1.87975 
+ 1.73819 


+14.98048 
+ .02502 
+15.00550 


+ 3.67752 


+ 1.73819 
+ .04479 
+ 1.78298 


+ 4.60856 
+ .01286 
+ 4.62142 


69 
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Computation of Base 2 with control by sums. 








Yo = +-.00113 

1 — .00113 + .96397 - 01283 + 1.95001 
+ .00113 — .00001 + .00319 + .00431 

1 0 + .96396 — 00964 t 1.95432 













+ 2.82716 | + 2.81497 


Yao = —-96396 
Ye, = +-01218 
i 0 + .9639% — .00964 | + 1.95432 
— .96396 + .00695 | - .95701 


0 + _ .99731 


1 — .01218 + 2.82716 
+ .01218 -— .00009 
0 + 2.82707 


+ 3.81497 
+ .01209 
t+ 3.82706 








Yoo = + .00269 
Y,, = - 2.82707 
TY, =+ -00721 





Computation of ,---',+%% » with control by sums. 












1 + .00113 — .96396 + .00269| + .03986 
1 + .01218 ~ 2.82707 | —- 1.81489 

1 + .00721 | + 1.00721 

1 + 1.00000 










— .08522 - .00010 +.08214 ~- .00023} ~ .00340 
+ 1.13486 + .01382 -— 3.20833 | - 2.05965 

— .17021 - .00123 | — .17144 

+ 3.14028 | + 3.14028 













- .08522 +1.13476 —.07425 
=% =, =X 7% 
to +%yZ = + 90578 





+ .90579 
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Finishing computations. 


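$M = a_0 = -.08522$   $\sigma = a_1 = 1.15577$   $a_2 = -.05541$   $a_3 = -.03234$
$M^* = -.17044$   $\sigma^* = 2.31154$   $S = -.04794$   $K = -.02799$

$\mathfrak{x}^2 = 1.34754$   $a_1^2 = 1.33580$   $M_1^2 = .99666$   $M_1 = .99833$
$a_0^2 = .00726$   $a_2^2 = .00307$

$\mathfrak{x}^2 - a_0^2 = 1.34028$   $a_1^2 + a_2^2 = 1.33887$   $M_2^2 = .99895$   $M_2 = .99947$
$a_3^2 = .00105$

$a_1^2 + a_2^2 + a_3^2 = 1.33992$   $M_3^2 = .99973$   $M_3 = .99987$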

So long as we pay regard to the Tchebychef coefficients $a_0,
\dots, a_3$ only, the purport of our results is that the observations
are somewhat overconcentrated to the right, and somewhat
underconcentrated near the mean. The sum of the squares of the
Tchebychef coefficients with higher subscripts than 3 is

$a_4^2 + \dots + a_{13}^2 = \mathfrak{x}^2 - (a_0^2 + \dots + a_3^2) = .00036$;

it is small compared with $a_2^2 = .00307$ and $a_3^2 = .00105$. The
vectors $\mathfrak{z}_4, \dots, \mathfrak{z}_{13}$ being normalized, we are sure that the
influence of $a_4, \dots, a_{13}$ cannot essentially disturb our statements.
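
The whole example can be recomputed in a few lines. The sketch below rests on these assumptions: the group counts are $1, 6, 12, 21, 36, 61, 73, 82, 72, 63, 38, 16, 5, 1$ (total 487, in agreement with the frequencies in the computation tables); the best values are the normal quantiles at $f_1 + \dots + f_{\nu-1} + \tfrac{1}{2} f_\nu$ as in No. 18; the product of vectors carries the weights $f_\nu$; and the vectors $\mathfrak{z}_\kappa$ are obtained by orthonormalizing the powers of the best values instead of following the tabular scheme.

```python
from itertools import accumulate
from math import sqrt
from statistics import NormalDist

def grouped_coefficients(x, f, xi, degree=3):
    """Tchebychef coefficients a_0..a_degree for grouped observations."""
    def dot(u, v):
        # product of vectors weighted by the frequencies f
        return sum(p * q * w for p, q, w in zip(u, v, f))

    basis, coeffs = [], []
    for k in range(degree + 1):
        h = [t ** k for t in xi]
        for z in basis:                       # strip the parts along earlier vectors
            c = dot(h, z)
            h = [p - c * q for p, q in zip(h, z)]
        norm = sqrt(dot(h, h))
        z = [p / norm for p in h]
        basis.append(z)
        coeffs.append(dot(x, z))
    return coeffs

counts = [1, 6, 12, 21, 36, 61, 73, 82, 72, 63, 38, 16, 5, 1]
x = [v / 2 for v in range(-7, 7)]             # x = x*/2 with x* = -7, ..., +6
total = sum(counts)
f = [c / total for c in counts]

# Best values: normal quantiles at the midpoints of the cumulated frequencies.
upper = list(accumulate(f))
xi = [NormalDist().inv_cdf(u - w / 2) for u, w in zip(upper, f)]

a = grouped_coefficients(x, f, xi)
S, K = a[2] / a[1], a[3] / a[1]
x_sq = sum(v * v * w for v, w in zip(x, f))
M3 = sqrt(sum(c * c for c in a[1:]) / (x_sq - a[0] ** 2))
```

Small departures from the printed figures in the last decimals are to be expected, since the text works with five-place intermediate values.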

Finally we give an illustration by computing and drawing the
“best curve” of the normal type, corresponding to the observations
$x_\nu$. With it we mean that curve $y = \frac{1}{\alpha}\varphi\left(\frac{x - \beta}{\alpha}\right)$, the best values
of which are the components of the vector $a_0\mathfrak{z}_0 + a_1\mathfrak{z}_1$. The
values $\alpha, \beta$ (see No. 2) have to satisfy

$\beta\,\mathfrak{e}_0 + \alpha\,\mathfrak{e}_1 = a_0\mathfrak{z}_0 + a_1\mathfrak{z}_1$;

substituting

$\mathfrak{z}_0 = \mathfrak{e}_0,\quad \mathfrak{z}_1 = \dfrac{1}{\lambda_1}(\gamma_{10}\mathfrak{e}_0 + \mathfrak{e}_1)$

we get

$\alpha = \dfrac{a_1}{\lambda_1} = +1.17717$

$\beta = a_0 + \gamma_{10}\dfrac{a_1}{\lambda_1} = -.08389$.

With these values $\alpha$ and $\beta$, the curve in Fig. 6 represents the
function

$y = \dfrac{1}{\alpha\sqrt{2\pi}}\, e^{-\frac{(x - \beta)^2}{2\alpha^2}}$.

[Fig. 6]

The abscissae of the points marked by circles are the observations
$x_\nu$, their ordinates are equal to the corresponding $f_\nu$ divided by
the length 0.5 of the group intervals.


University of Kiel, Germany. 
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A New Type of Average for Security Prices 
The market averages that are most popular with the American 
investing public are essentially weighted or unweighted means of 
security prices at designated intervals. As a rule, they ignore 
the volume of sales—an element to which experienced traders 
attribute considerable importance. Such averages endeavor only 
to reflect the average price level at periodic intervals, and all of 

those published are entirely satisfactory in this respect. 


In this note we shall discuss an acquisition average which,
instead of being concerned with the price level at a given moment,
attempts to answer the question, “what is the average price
actually paid for the securities by their present owners.”


The problem can best be appreciated by presenting two
examples of acquisition averages prior to the mathematical theory.
The first entry of Table 1 states that for the week ending January
7, 1928 United States Steel common closed at 150 6/8, and that
the acquisition average on this date was $137.75. At the time
of the market crash in October, 1929, the acquisition average had 
risen to about $212, and at the present moment this average has 
receded to about $48. Of course, some of the individuals who 
bought Steel at about $200 per share are still holding on to 
it, whereas others among the present holders obtained theirs in 
1932 at less than $25 per share. According to our theory, the 
mean of such acquisition prices is the $48 noted above. 


As an illustration of corresponding averages computed on a
daily basis, Table 2 presents the daily closing prices and acquisition
averages for Auburn, covering the last half of 1934. This
stock was selected because of its relatively small capitalization and
frequent activity.











30 4137.3 1140,39 
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146.0 
147.4 


| 138.42 
| 138.80 
146.3 | 139.19 
143.4 | 139.42 
145.7 | 139.60 
140.0 | 139.70 
140.0 | 139.71 
140.2 x | 138.00 
144.7 | 138.33 
1147.7 | 138.99 
1147.5 | 139.83 
1147.3 | 140.35 
1147.1 1140.51 

1141.09 
141.34 
141.48 
141.58 
141.90 
142.10 
142.22 
140.70 
140.82 


1150.0 
1145.5 
145.3 
148.0 
1148.6 
1145.6 
146.7 
1145.2 x 
1140.6 








1138.5 1140.75 
1133.2 |140.52 
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TABLE 1. 


ACQUISITION 





2-2 |184.7 
9 1173 


188.0 
3 181.1 
0) 183.6 
> 1186.3 
13. 1188.5 
20 186.0 
27 =: 1186.2 
4 182.1 
1] 179.4 
18 {174.6 
25 





) 
3 

-3- 2 iss *|1on98 
QO 


166.98 
168.58 
1169.51 
1170.46 
172.06 
173.52 
174.4] 
174.79 
175.20 
175.33 
175.38 











AVERAGES For U. S. 
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169.1 
169.5 
| 169.6 
179.0 
184.4 
| 1826 
186.0 
183.0 
184.3 x 
181.0 
| 179.2 
| 187.6 
| 193.4 
| 196.4 
193.1 
1195.2 
188.0 
170.2 
172.6 
| 172.7 
i 





| 191.32 








192.63 


190.31 

189.43 

188.% 
188.59 
188.37 
188.15 
186.2) 
186.07 
185.88 
185.68 
185.97 
186.90 
186,% 
187.40 
187.5] 
186.73 
185.65 
185.30 





167.7. 1175.25 24 $172.0 | 184% 

6-1 165.0 x 173.28 31 | 173.5 x] 18287 
8 168.0 {173.16 6-7 |164.2 |1824 

15 1175.52 |169.40 14 1162.4 1181.4 
22 {180.6 {169.80 21 4155.2 |179,79 
29 $190.6 {170.90 28 |156.0 |17852 § 
7 ) 1196.3 1172.43 7- 5 j 157.7 177.99 
{202 3 1174.63 12 }160.6 | 177.4 

2 207.7. 1177.48 19 |166.¢ 1177.03 . 

7 [206 0 4179.85 26 1169.7 1176.83 
8-3 |2044 182.20 8-2 | 1662 |1766 
10 4218.0 {185.68 9 1159.4 | 176.03 

7 1238.5 {192.17 16 | 165.3 1175.52 

24 |258.2 |197.54 23 1168.2 |17529 
31 (256.4 x 1198.64 30 | 171.2 X} 173.38 

7 |247.4 |202.19 9-6 | 173.1 1173.34 
1233.2 | 205.4: 13 }.170.2 | 173.2 

1232.1 |207.7¢ 20 | 163.7 | 173.0 

\225.0 {210.12 27 1158.2 | 1722 

1217 1211.40 10-4 1156.6 | 17131 

1230.6 4212.56 } 11 | 148.4 | 169.4 

1209.0 4213.41 18 | 145.3 | 16825 

26 $203.4 1212.38 t 25 | 151.4 | 16723 
1193.2 {211.0 f i1l-1 145.6 | 166.39 

9 1171.0 1209.67 8 | 140.4 165.39 
1164 207.09 i 5 | 147.7 | 163.84 

z 67.0 4205.42 f 22 1147.2 | 163.12 

3 62.1 x!201.8 29 1145.4 | 162.89 
826 |200.3 H 12-6 |1423x110038 

14 1174.0 1196.87 ‘ 13 | 136.4 | 159.24 

21 4163.0 1195.11 b 20 | 140.5 ) 158.19 
28 (164.4 |193.7¢ t 27 | 136.6 4157.73 
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TABLE 1—(Continued) 
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TABLE 2. 


DAILY CLOSING PRICES AND ACQUISITION AVERAGES FOR AUBURN,
JULY 1ST TO DECEMBER 30TH, 1933.
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We shall now develop the theory on which the preceding tables 
were constructed. As a simple illustration let us suppose that 100 
individuals start an enterprise, that a total of 100 shares of stock 
are issued, and that each of the individuals purchases one share 
for $100. The total book value of the issue at the date of issue 
is therefore, 


$V_0 = 100 \times 100 = \$10{,}000$,

and the acquisition average then is

$A_0 = \dfrac{V_0}{100} = \$100.00$.


If the first transfer of stock resulted from the sale of a single
share at 150, the total amount paid by the group now owning
all the issue is obviously

$V_1 = 99(100) + 150 = 10{,}050$,

and the new acquisition average is

$A_1 = \dfrac{V_1}{100} = \left(1 - \dfrac{1}{100}\right) A_0 + \dfrac{150}{100} = 100.50$.

If somewhat later the next sale of stock is a single share at
50, we may assume that

$V_2 = 99(100.50) + 50 = 9999.50$

and consequently

$A_2 = \dfrac{V_2}{100} = \left(1 - \dfrac{1}{100}\right) A_1 + \dfrac{50}{100} = 99.995$.


Our first assumption is, therefore, that whenever the sale of a 
share of stock is recorded, it is equally likely that any one of the 
previous holders sold the share. More will be said of this assump- 
tion later. 
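
The illustration above can be checked numerically. The following is a minimal sketch in Python; the function name `update_average` and its parameter names are introduced here for illustration only and do not appear in the original.

```python
def update_average(avg, price, shares_listed):
    """One sale of a single share.

    Under the first assumption the share sold is equally likely to come
    from any holder, so it leaves the pool at the current average `avg`
    and re-enters it at the sale price `price`.
    """
    return (1 - 1 / shares_listed) * avg + price / shares_listed


avg = 100.00                 # A_0: every one of the 100 shares bought at $100
history = []
for price in (150, 50):      # the two sales of the illustration
    avg = update_average(avg, price, shares_listed=100)
    history.append(avg)

print([round(a, 3) for a in history])
```

Run as written, the two values agree with the 100.50 and 99.995 obtained in the text.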

In generalizing, let us adopt the following notation: 


$l$ designates the number of share units listed for an issue.

$A_0$ is the acquisition average at a given initial date.

$p_x$ denotes the price at which the $x$-th unit of stock is
sold following the initial date.

$A_x$ is the acquisition average immediately after the sale of the
$x$-th unit.


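We have then that

$$A_1 = \left(1 - \frac{1}{l}\right) A_0 + \frac{p_1}{l},$$

$$A_2 = \left(1 - \frac{1}{l}\right) A_1 + \frac{p_2}{l} = \left(1 - \frac{1}{l}\right)^2 A_0 + \frac{p_1}{l}\left(1 - \frac{1}{l}\right) + \frac{p_2}{l},$$

$$A_3 = \left(1 - \frac{1}{l}\right) A_2 + \frac{p_3}{l} = \left(1 - \frac{1}{l}\right)^3 A_0 + \frac{p_1}{l}\left(1 - \frac{1}{l}\right)^2 + \frac{p_2}{l}\left(1 - \frac{1}{l}\right) + \frac{p_3}{l},$$

$$A_x = \left(1 - \frac{1}{l}\right)^x A_0 + \frac{p_1}{l}\left(1 - \frac{1}{l}\right)^{x-1} + \frac{p_2}{l}\left(1 - \frac{1}{l}\right)^{x-2} + \cdots + \frac{p_{x-1}}{l}\left(1 - \frac{1}{l}\right) + \frac{p_x}{l}. \tag{1}$$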
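
As a check on the general expression (1), the share-by-share recursion and the expanded sum may be compared directly. This is a sketch only: the function names and the list of sale prices below are hypothetical and chosen merely for the demonstration.

```python
def average_by_recursion(a0, prices, l):
    """Apply A_k = (1 - 1/l) A_{k-1} + p_k / l once for every sale."""
    avg = a0
    for p in prices:
        avg = (1 - 1 / l) * avg + p / l
    return avg


def average_by_closed_form(a0, prices, l):
    """Evaluate the expanded sum: the i-th sale carries weight q**(x - i) / l."""
    x = len(prices)
    q = 1 - 1 / l
    return q**x * a0 + sum(
        p / l * q ** (x - i) for i, p in enumerate(prices, start=1)
    )


prices = [150, 50, 120, 80, 95]      # hypothetical sale prices
step = average_by_recursion(100.0, prices, l=100)
closed = average_by_closed_form(100.0, prices, l=100)
print(step, closed)
```

The two results should differ only by floating-point rounding, since the sum is nothing more than the recursion written out.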

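If we multiply both sides of this last equation by $\left(1 - \frac{1}{l}\right)$ and
then subtract the product from (1), we obtain

$$A_x = A_0\left(1 - \frac{1}{l}\right)^x - p_1\left(1 - \frac{1}{l}\right)^x + \left(1 - \frac{1}{l}\right)^{x-1}(p_1 - p_2) + \left(1 - \frac{1}{l}\right)^{x-2}(p_2 - p_3) + \cdots + \left(1 - \frac{1}{l}\right)(p_{x-1} - p_x) + p_x. \tag{2}$$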

We shall now make a second assumption, namely, that the
prices vary linearly from $0$ to $x$. To illustrate, if Steel closes
one week at 54 and during the next week 100,000 shares are sold 
after which the close is 59, our assumption means that after 
20,000 shares were sold the quotation is 55, at 40,000 shares 
the price is 56, etc. Actually the price trend between two dates 
is not a straight line but rather a scattering of points. However, 
the linear assumption introduces compensating errors which have 
been found to result in only negligible variations in the resulting 
acquisition averages. We may write, therefore, 


$$p_i = p_0 + \frac{i}{x}\,(p_x - p_0), \qquad i = 1, 2, \ldots, x, \tag{3}$$

where $p_0$ is the price at the initial date,


and equation (2) then reduces to 


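$$A_x = A_0\left(1 - \frac{1}{l}\right)^x + p_0\left[\frac{l-1}{x}\left\{1 - \left(1 - \frac{1}{l}\right)^x\right\} - \left(1 - \frac{1}{l}\right)^x\right] + p_x\left[1 - \frac{l-1}{x}\left\{1 - \left(1 - \frac{1}{l}\right)^x\right\}\right]. \tag{4}$$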
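
The passage from (2) to (4) may be traced as follows. This is a sketch which reads the linear assumption as $p_i = p_0 + i\,d$ with $d = (p_x - p_0)/x$; the symbols $q = 1 - 1/l$ and $d$ are shorthand introduced here for brevity and are not part of the original notation. Each difference $p_i - p_{i+1}$ in (2) then equals $-d$, and since $1 - q = 1/l$ the geometric sum is $\sum_{k=1}^{x-1} q^k = l\,(q - q^x)$:

$$\begin{aligned}
A_x &= (A_0 - p_1)\,q^x - d\sum_{k=1}^{x-1} q^k + p_x \\
    &= (A_0 - p_0 - d)\,q^x - d\,l\,(q - q^x) + p_x \\
    &= (A_0 - p_0)\,q^x + p_x - d\,(l - 1)\,(1 - q^x).
\end{aligned}$$

The last line uses $l\,q = l - 1$. Replacing $d$ by $(p_x - p_0)/x$ and collecting the coefficients of $p_0$ and $p_x$ gives an expression of the form (4).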





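But since in practice both $l$ and $x$ are large integers we
may write

$$\left(1 - \frac{1}{l}\right)^x = e^{-\lambda}, \quad \text{where } \lambda = \frac{x}{l}, \tag{5}$$

and (4) then becomes

$$A_x = \alpha A_0 + \beta p_0 + \gamma p_x, \tag{6}$$

where

$$\alpha = e^{-\lambda}, \qquad \beta = \frac{1 - e^{-\lambda}}{\lambda} - e^{-\lambda}, \qquad \gamma = 1 - \frac{1 - e^{-\lambda}}{\lambda}.$$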
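
To indicate how closely (6) follows the share-by-share computation, the sketch below evaluates both for one interval. The move from 54 to 59 on 100,000 shares is taken from the Steel illustration; the listing of 1,000,000 share units, the initial average of 50, and all function names are assumptions made purely for the demonstration.

```python
import math


def turnover_coefficients(lam):
    """alpha, beta, gamma of equation (6) for a rate of turnover lam = x / l > 0."""
    alpha = math.exp(-lam)
    ratio = (1 - alpha) / lam
    return alpha, ratio - alpha, 1 - ratio


def exact_average(a0, p0, px, x, l):
    """Share-by-share recursion with prices moving linearly from p0 to px."""
    avg = a0
    for i in range(1, x + 1):
        price = p0 + (px - p0) * i / x
        avg = (1 - 1 / l) * avg + price / l
    return avg


# Assumed figures: l = 1,000,000 and A_0 = 50 are hypothetical.
a0, p0, px, x, l = 50.0, 54.0, 59.0, 100_000, 1_000_000
alpha, beta, gamma = turnover_coefficients(x / l)
approx = alpha * a0 + beta * p0 + gamma * px
exact = exact_average(a0, p0, px, x, l)
print(alpha + beta + gamma, approx, exact)
```

The three coefficients sum to unity, so that (6) is a weighted mean of the previous average and the two closing prices; with $l$ and $x$ of this size the approximate and exact averages should agree to several decimal places.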


Tables of $\alpha$, $\beta$ and $\gamma$ have been computed for the
interval rate-of-turnover, $\lambda$. With the aid of these, Tables
1 and 2 are readily extended. A slight difficulty is encountered
in determining the acquisition average at an initial point. At the
outset it is necessary to assume two initial acquisition averages, 
one equal to the “high” at some point in the past, the other equal 
to the corresponding “low.” The true acquisition average cer- 
tainly lies between these two limits. It is necessary to start com- 
putations sufficiently far before the date of the first desired 
acquisition average so that the two series derived respectively 
from the highs and lows will converge to a single average. The 
length of the past experience period required will depend upon 
the rate of turnover of the stock. The activity in grains is fre- 
quently so great that the two series will converge over a period 
of two weeks. 
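
The convergence of the two series may be illustrated as follows. In this sketch the closing prices, the constant turnover of 0.5 per interval, and the starting "high" and "low" of 70 and 40 are all hypothetical, and the function name `step_average` is introduced here only for the example. Because both series receive the same price terms of (6) at every step, the gap between them is simply multiplied by $e^{-\lambda}$ in each interval, whatever the prices may be.

```python
import math


def step_average(a_prev, p_prev, p_now, lam):
    """Advance one interval with equation (6); lam is that interval's turnover."""
    alpha = math.exp(-lam)
    ratio = (1 - alpha) / lam
    return alpha * a_prev + (ratio - alpha) * p_prev + (1 - ratio) * p_now


# Hypothetical closes and an assumed constant turnover per interval.
closes = [54, 59, 57, 61, 60, 58, 62, 63, 61, 64, 66]
lam = 0.5
from_high, from_low = 70.0, 40.0     # assumed past "high" and "low"
gaps = []
for p_prev, p_now in zip(closes, closes[1:]):
    from_high = step_average(from_high, p_prev, p_now, lam)
    from_low = step_average(from_low, p_prev, p_now, lam)
    gaps.append(from_high - from_low)

print([round(g, 3) for g in gaps])
```

After ten intervals the initial gap of 30 has shrunk by the factor $e^{-5}$, which shows why a more active issue, having a larger $\lambda$, requires a shorter period of past experience.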

I wish to point out emphatically that this acquisition average
is an average and nothing more. Like any other average its
value depends largely upon the ability of the individual using it.
Although the use of this average might prove of value to an
investor, it can not rightly be said that this is a forecasting
formula. I doubt the existence of any valid method of forecasting,
mathematical or otherwise. The acquisition average merely
measures secondary phenomena, and provides a tool for recognizing
an unfavorable condition that might very easily be changed
into a favorable situation by any one of numerous causes.
Thus, if the market quotation is greater than the acquisition
average, it follows that the average owner of the stock in question
has a “paper profit.” Moreover, since a sale is made when the
owner of the stock and the prospective purchaser can agree on
a price, and because of the peculiar psychology usually affecting
one possessing a paper profit, the excess of the market price over
the acquisition average tends through bidding to increase both
prices and acquisition averages. This vicious circle carries prices
too far in either direction until some “impressed force” changes
the trend abruptly.

Since the price of a security at a given time depends upon 
the status of the entire market as well as the intrinsic value of 
that security, it follows that a general average for the acquisition 
figures for a number of the “market leaders” would probably be 
of value to certain investors. In fact, any of the popular market 
averages can be accompanied by corresponding acquisition aver- 
ages. 

Since in many cases fifty percent of the stock is kept to protect
control, it is evident that one might be justified in using
one-half the share units listed for the value of $l$ in formula
(5) for $\lambda$. Again, if one desires to investigate the status of the
group operating on margins, the amount of the “floating stock”
and the brokers’ loans must be taken into consideration in determining
$\lambda$.

In conclusion let me point out that under the most favorable
conditions our method of determining the acquisition average can
do no more than a 100% successful questionnaire inquiring of
stockholders the price at which each share was purchased. Of
course all stockholders would not give such information if they
could, and couldn’t if they would.











